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Basic Concepts in Graph Theory 


Section 1: What is a Graph? 


There are various types of graphs, each with its own definition. Unfortunately, some 
people apply the term “graph” rather loosely, so you can’t be sure what type of graph 
they’re talking about unless you ask them. After you have finished this chapter, we expect 
you to use the terminology carefully, not loosely. To motivate the various definitions, we’ll 
begin with some examples. 


Example 1 (A computer network) Computers are often linked with one another so 
that they can interchange information. Given a collection of computers, we would like to 
describe this linkage in fairly clean terms so that we can answer questions such as “How can 
we send a message from computer A to computer B using the fewest possible intermediate 
computers?” 


We could do this by making a list that consists of pairs of computers that are connected. 
Note that these pairs are unordered since, if computer C can communicate with computer 
D, then the reverse is also true. (There are sometimes exceptions to this, but they are 
rare and we will assume that our collection of computers does not have such an exception.) 
Also, note that we have implicitly assumed that the computers are distinguished from each 
other: It is insufficient to say that “A PC is connected to a Mac.” We must specify which 
PC and which Mac. Thus, each computer has a unique identifying label of some sort. 


For people who like pictures rather than lists, we can put dots on a piece of paper, 
one for each computer. We label each dot with a computer’s identifying label and draw a 
curve connecting two dots if and only if the corresponding computers are connected. Note 
that the shape of the curve does not matter (it could be a straight line or something more 
complicated) because we are only interested in whether two computers are connected or 
not. Below are two such pictures of the same graph. Each computer has been labeled by 
the initials of its owner. 


EN 


™ 
SH RL cS 


MN SE EN MN SE 


SH RL 
SM Lo SM 


CS 


Computers (vertices) are indicated by dots (e) with labels. The connections (edges) are 
indicated by lines. When lines cross, they should be thought of as cables that lie on top of 
each other — not as cables that are joined. O 
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The notation P;,(V) stands for the set of all k-element subsets of the set V. Based on 
the previous example we have 


Definition 1 (Simple graph) A simple graph G is a pair G = (V, E) where 
e V is a finite set, called the vertices of G, and 


e E is asubset of P2(V) (i.e., a set E of two-element subsets of V ), called the edges 
of G. 


In our example, the vertices are the computers and a pair of computers is in & if and only 
if they are connected. 


Example 2 (Routes between cities) Imagine four cities named, with characteristic 
mathematical charm, A, B,C and D. Between these cities there are various routes of travel, 
denoted by a,b,c,d,e, f and g. Here is picture of this situation: 


a 


Cc D 


Looking at this picture, we see that there are three routes between cities B and C. These 
routes are named d,e and f. Our picture is intended to give us only information about 
the interconnections between cities. It leaves out many aspects of the situation that might 
be of interest to a traveler. For example, the nature of these routes (rough road, freeway, 
rail, etc.) is not portrayed. Furthermore, unlike a typical map, no claim is made that 
the picture represents in any way the distances between the cities or their geographical 
placement relative to each other. The object shown in this picture is called a graph. 


Following our previous example, one is tempted to list the pairs of cities that are 
connected; in other words, to extract a simple graph from the information. Unfortunately, 
this does not describe the problem adequately because there can be more than one route 
connecting a pair of cities; e.g., d, e and f connecting cities B and C in the figure. How 
can we deal with this? Here is a precise definition of a graph of the type required to handle 
this type of problem. QJ 


Definition 2 (Graph) A graph is a triple G = (V, E,¢) where 
e V is a finite set, called the vertices of G, 


e FE is a finite set, called the edges of G, and 


e ¢ is a function with domain E and codomain P2(V). 
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In the pictorial representation of the cities graph, G = (V, E,¢) where 
V ={A,B,C,D}, E = {a,b,c,d,e, f,g} 


and 


6=( a b Cc d € a g ) 
~ \ {A,B} {A,B} {A,C} {B,C} {B,C} {B,C} {B,D}]° 


Definition 2 tells us that to specify a graph G it is necessary to specify the sets V and E 
and the function ¢. We have just specified V and @¢ in set theoretic terms. The picture of 
the cities graph specifies the same V and ¢ in pictorial terms. The set V is represented 
clearly by dots (e), each of which has a city name adjacent to it. Similarly, the set E is also 
represented clearly. The function ¢ is determined from the picture by comparing the name 
attached to a route with the two cities connected by that route. Thus, the route name d is 
attached to the route with endpoints B and C. This means that ¢(d) = {B,C}. 


Note that, since part of the definition of a function includes its codomain and domain, @ 
determines P2(V) and EF. Also, V can be determined from P2(V). Consequently, we could 
have said that a graph is a function ¢ whose domain is a finite set and whose codomain is 
P2(V) for some finite set V. Instead, we choose to specify V and EF explicitly because the 
vertices and edges play a fundamental role in thinking about a graph G. 


The function ¢ is sometimes called the incidence function of the graph. The two 
elements of ¢(x) = {u,v}, for any x € E, are called the vertices of the edge x, and we 
say u and v are joined by x. We also say that u and v are adjacent vertices and that u is 
adjacent to v or , equivalently, v is adjacent to u. For any v € V, if v is a vertex of an edge 
x then we say 2 is incident on v. Likewise, we say v is a member of x, v is on 2, or v is in 
x. Of course, v is a member of x actually means v is a member of ¢(z). 


Here are two additional pictures of the same cities graph given above: 


The drawings look very different but exactly the same set V and function ¢ are specified 
in each case. It is very important that you understand exactly what information is needed 
to completely specify the graph. When thinking in terms of cities and routes between them, 
you naturally want the pictorial representation of the cities to represent their geographical 
positioning also. If the pictorial representation does this, that’s fine, but it is not a part of 
the information required to define a graph. Geographical location is extra information. The 
geometrical positioning of the vertices A,B,C and D is very different, in the first of the 
two pictorial representations above, than it was in our original representation of the cities. 
However, in each of these cases, the vertices on a given edge are the same and hence the 
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graphs specified are the same. In the second of the two pictures above, a different method 
of specifying the graph is given. There, ¢~', the inverse of ¢, is given. For example, 
¢ 1({C, B}) is shown to be {d,e, f}. Knowing ¢~! determines ¢ and hence determines G 
since the vertices A, B,C and D are also specified. 


Example 3 (Loops) A loop is an edge that connects a vertex to itself. Graphs and 
simple graphs as defined in Definitions 1 and 2 cannot have loops. Why? Suppose e € EF is 
a loop in a graph that connects v € V to itself. Then ¢(e) = {v,v} = {v} because repeated 
elements in the description of a set count only once — they’re the same element. Since 
{v} € P2(V), the range of ¢, we cannot have ¢(e) = {v, v}. In other words, we cannot have 
a loop. 


Thus, if we want to allow loops, we will have to change our definitions. For a graph, 
we expand the codomain of ¢ to be P2(V) UP\(V). For a simple graph we need to change 
the set of allowed edges to include loops. This can be done by saying that FE is a subset 
of P2(V) UPi(V) instead of a subset of just Po(V). For example, if V = {1,2,3} and 
E = {{1, 2}, {2}, {2,3}}, this simple graph has a loop at vertex 2 and vertex 2 is connected 
by edges to the other two vertices. When we want to allow loops, we speak of a graph with 
loops or a simple graph with loops. 


Examples of graphs with loops appear in the exercises. OJ 


We have two definitions, Definition 1 (simple graph) and Definition 2 (graph). How 
are they related? Let G = (V,E) be a simple graph. Define ¢: EF — E to be the identity 
map; i.e., d(e) = e for alle € FE. The graph G’ = (V, EF, ¢) is essentially the same as G. 
There is one subtle difference in the pictures: The edges of G are unlabeled but each edge 
of G’ is labeled by a set consisting of the two vertices at its ends. But this extra information 
is contained already in the specification of G. Thus, simple graphs are a special case of 
graphs. 


Definition 3 (Degrees of vertices) Let G=(V,E,#) bea graph and v € V a vertex. 
Define the degree of v, d(v) to be the number of e € E such that v € ¢(e); i.e., e is incident 
on v. 

Suppose |V| = n. Let dy, d2,...,dn, where dy < dy <--- <d, be the sequence of degrees 
of the vertices of G, sorted by size. We refer to this sequence as the degree sequence of the 
graph G. 


In the graph for routes between cities, d(A) = 3, d(B) = 6, d(C) = 4, and d(D) = 1. The 
degree sequence is 1,3,4,6. 


Sometimes we are interested only in the “structure” or “form” of a graph and not in 
the names (labels) of the vertices and edges. In this case we are interested in what is called 
an unlabeled graph. A picture of an unlabeled graph can be obtained from a picture of a 
graph by erasing all of the names on the vertices and edges. This concept is simple enough, 
but is difficult to use mathematically because the idea of a picture is not very precise. 


The concept of an equivalence relation on a set is an important concept in mathe- 
matics and computer science. We'll explore it here and will use it to develop an intuitive 
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understanding of unlabeled graphs. Later we will use it to define connected components 
and biconnected components. Equivalence relations are discussed in more detail in A Short 
Course in Discrete Mathematics, the text for the course that precedes this course. 


Definition 4 (Equivalence relation) An equivalence relation on a set S is a partition 
of S. We say that s,t € S are equivalent if and only if they belong to the same block 
(called an equivalence class in this context) of the partition. If the symbol ~ denotes the 
equivalence relation, then we write s ~ t to indicate that s and t are equivalent. 


Example 4 (Equivalence relations) Let S be any set and let all the blocks of the 
partition have one element. Two elements of S are equivalent if and only if they are the 
same. This rather trivial equivalence relation is, of course, denoted by “=”. 


Now let the set be the integers Z. Let’s try to define an equivalence relation by saying 
that n and k are equivalent if and only if they differ by a multiple of 24. Is this an 
equivalence relation? If it is we should be able to find the blocks of the partition. There are 
24 of them, which we could number 0,...,23. Block j consists of all integers which equal 
j plus a multiple of 24; that is, they have a remainder of 7 when divided by 24. Since two 
numbers belong to the same block if and only if they both have the same remainder when 
divided by 24, it follows that they belong to the same block if and only if their difference 
gives a remainder of 0 when divided by 24, which is the same as saying their difference is 
a multiple of 24. Thus this partition does indeed give the desired equivalence relation. 


Now let the set be Z x Z*, where Z* is the set of all integers except 0. Write (a,b) ~ 
(c,d) if and only if ad = bc. With a moment’s reflection, you should see that this is a way 
to check if the two fractions a/b and c/d are equal. We can label each equivalence class 
with the fraction a/b that it represents. In an axiomatic development of the rationals from 
the integers, one defines a rational number to be just such an equivalence class and proves 
that it is possible to add, subtract, multiply and divide equivalence classes. 


Suppose we consider all functions S = m@. We can define a partition of S' in a number 
of different ways. For example, we could partition the functions f into blocks where the 
sum of the integers in the Image(f) is constant, where the max of the integers in Image(f) 
is constant, or where the “type vector” of the function, namely, the number of 1’s, 2’s, etc. 
in Image(f), is constant. Each of these defines a partition of S. O 


In the next theorem we provide necessary and sufficient conditions for an equivalence 
relation. Verifying the conditions is a useful way to prove that some particular situation is 
an equivalence relation. Recall that a binary relation on a set S is a subset R of S x S. 


Theorem 1 (Equivalence Relations) Let S be a set and suppose that we have a 
binary relation R C S x S. We write s ~ t whenever (s,t) € R. This is an equivalence 
relation if and only if the following three conditions hold. 


(i) (Reflexive) For all s € S we have s ~ s. 
(ii) (Symmetric) For all s,t € S such that s ~ t we havet ~ s. 


(iii) (Transitive) For all r,s,t € S such that r ~~ s and s ~t we have r ~ t. 
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Proof: We first prove that an equivalence relation satisfies (i)—(iii). Suppose that ~ is an 
equivalence relation. Since s belongs to whatever block it is in, we have s ~ s. Since s ~ t 
means that s and ¢ belong to the same block, we have s ~ ¢ if and only if we have t ~ s. 
Now suppose that r ~ s ~ t. Then r and s are in the same block and s and ¢ are in the 
same block. Thus r and t are in the same block and so r ~ t. 


We now suppose that (i)—(iii) hold and prove that we have an equivalence relation. 
What would the blocks of the partition be? Everything equivalent to a given element 
should be in the same block. Thus, for each s € S let B(s) be the set of all t € S such that 
s~t. We must show that the set of these sets form a partition of S. 


In order to have a partition of S, we must have 
(a) the B(s) are nonempty and every t € S is in some B(s) and 
(b) for every p,q € S, B(p) and B(q) are either equal or disjoint. 


Since ~ is reflexive, s € B(s), proving (a). Suppose x € B(p)M B(q) and y € B(p). We 
have, p~2,q~axandp~y. Thusq~az~p~yand so y € B(q), proving that 
B(p) C B(q). Similarly B(q) C B(p) and so B(p) = B(q). This proves (b). O 


Example 5 (Equivalent forms) Consider the following two graphs, represented by 
pictures: 


Now let’s remove all symbols representing edges and vertices. What we have left are 
two “forms” on which the graphs were drawn. You can think of drawing a picture of a 
graph as a two step process: (1) draw the form; (2) add the labels. One student referred to 
these forms as “ghosts of departed graphs.” Note that form F, and form Fy, have a certain 
eerie similarity (appropriate for ghosts). 


AD 
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If you use your imagination a bit you can see that form F, can be transformed into form Fy, 
by sliding vertices around and bending, stretching, and contracting edges as needed. The 
edges need not be detached from their vertices in the process and edges and vertices, while 
being moved, can pass through each other like shadows. Let’s refer to the sliding, bending, 
stretching, and contracting process as “morphing” form F), into Fy. Morphing is easily seen 
to define an equivalence relation ~ on the set of all forms. Check out reflexive, symmetric, 
and transitive, for the morphing relation ~. By Theorem 1, the morphing equivalence 
relation partitions the set of all forms of graphs into blocks or equivalence classes. This is 
a good example where it is easier to think of the relation ~ than to think globally of the 
partition of the forms. 


Now suppose we have any two graphs, Ga = (Va, Ea, a) and Gy = (Vs, Ev, dy). Think 
of these graphs not as pictures, but as specified in terms of sets and functions. Now choose 
forms F, and Fy for Gg and Gy respectively, and draw their pictures. We leave it to your 
intuition to accept the fact that either Fy, ~ Fy, no matter what you choose for F, and Fy, 
or F,, % Fy no matter what your choice is for the forms F, and Fy. If F, ~ Fy we say that 
G, and Gp are isomorphic graphs and write Gg ~ Gy. The fact that ~ is an equivalence 
relation forces © to be an equivalence relation also. In particular, two graphs G, and Gy 
are isomorphic if and only if you can choose any form F, for drawing Gz and use that same 
form for G,. O 


In general, deciding whether or not two graphs are isomorphic can be very difficult 
business. You can imagine how hard it would be to look at the forms of two graphs with 
thousands of vertices and edges and deciding whether or not those forms are equivalent. 
There are no general computer programs that do the task of deciding isomorphism well. For 
graphs known to have special features, isomorphism of graphs can sometimes be decided 
efficiently. In general, if someone presents you with two graphs and asks you if they are 
isomorphic, your best answer is “no.” You will be right most of the time. 


*Random Graphs 


We now look briefly at a subject called random graphs. They often arise in the analysis 
of graphical algorithms and of systems which can be described graphically (such as the 
web). There are two basic ways to describe random graphs. One is to let the probability 
space be the set of all graphs with, for example, n vertices and q edges and use the uniform 
distribution. The other, which is often easier to study is described in the following definition. 
It is the one we study here. 


*Definition 5 (Random graph model) Let G(n,p) be the probability space obtained 
by letting the elementary events be the set of all n-vertex simple graphs with V = n. If 
G€G(n,p) has m edges, the P(G) = p™q’-™ where q =1-—p and N = ap 


We need to show that G(n, p) is a probability space. There is a nice way to see this by 
reinterpreting P. List the N = (5) vertices P2(V) in lex order. Let the sample space be 
U = x {choose, reject} with P(a,,...,an) = P*(a,) x--- x P*(an) where P* (choose) = p 
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and P*(reject) = 1 — p. We’ve met this before in Unit Fn and seen that it is a probability 
space. To see that it is, note that P > 0 and 


SS) Plai,...,av)= >> P*(ai) x +++ x P*(an) 


di sew ; (Sn) a os P*(ax)) 
er ee re pee ae eee 


Why is this the same as the definition? Think of the chosen pairs as the edges of 
a graph chosen randomly from G(n,p). If G has m edges, then its probability should be 
p'™(1—p)N-™ according to the definition. On the other hand, since G has m edges, exactly 
m of a1,...,an equal “choose” and so, in the new space, P(a,,...,an) = p™(1—p)N-™ 
also. We say that we are choosing the edges of the random graph independently. 


*Example 6 (The number of edges in random graph) Let X bea random variable 
that counts the number of edges in a random graph in G(n,p). What are the expected 
value and variance of X? In U = x {choose, reject}, let 


1 if a; = choose, 
ANON SN )= . if a; = reject. 


You should be able to see that X = X,+---+Xy and that the X; are independent random 
variables with P(X; = 1) = p. This is just the binomial distribution (Unit Fn). We showed 
that the mean is Np and the variance is Npq, where N = (i) andg=1-p. O 


*Example 7 (Triangles in random graphs) How often can we find 3 vertices {u,v, w} 
in a random graph so that {u,v}, {u,w}, and {v,w} are all edges of the graph? In other 
words, how often can we find a “triangle”? How can we do this? 


First, we need a sample space. It will be the random graph space introduced in 
Definition 5. Since we want to count something (triangles), we need a random variable. 
Let X be a random variable whose value is the number of triples of vertices such that the 
three possible edges connecting them are present in the random graph. In other words, X 
is defined for each graph, G, and its value, X(G), is the number of triangles in the graph 
G. We want to compute E(X). It would also be nice to compute Var(X) since that gives 
us some idea of how much X tends to vary from graph to graph — large Var(X) means 
there tends to be a lot of variation in the number of triangles from graph to graph and 
small Var(X) means there tends to be little variation. 


Let Xu,v,w be a random variable which is 1 if the triangle with vertices {u,v,w} is 
present and 0 otherwise. Then X is the sum of Xy,y,~ over all {u,v,w} € P3(V). Since 
expectation is linear, E(X) is the sum of E(Xy,y,w) over all {u,v,w} € P3(V). Clearly 
E(Xu,v,w) does not depend on the particular triple. Since there are (3) possibilities for 
{u,v, wh, E(X) = (3) E(X1,2,3)- 


We want to compute E(.Xj,2.3). It is given by 


E(X12,3) = OP(Xi23=0)+1P(Xi123=1) = P(Xi23 = 1). 
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The only way Xj,2,3 = 1 can happen is for the edges {1,2}, {1,3}, and {2,3} to all be 
present in the graph. (We don’t care about any of the other possible edges.) Since each of 
these events has probability p and the events are independent we have P(X1,2.3 = 1) =p. 
Thus E(X1,2,3) = p® and so E(X) = (%)p®. In other words, on average we see about (5)p® 
triangles. For example, if p = 1/2 all graphs are equally likely (You should show this.) and 
so the average number of triangles over all graphs with n vertices is (3) /8. When n = 5, 
this average is 1.25. Can you verify this by looking at all the 5-vertex graphs? How much 
work is involved? 

What happens when n is very large? Then () = ana in=2) “behaves like” n°/6. 
(“Behaves like” means that, as n goes to infinity, the limit of the ratio (4) /(n*/6) is 1.) 


3 
Thus the expected number of triangles behaves like (np)?/6. 


What about the variance? We’ll work it out in the next example. For now, we'll 
simply tell you that it behaves like n*p?(1 — p?)/2. What does this tell us for large n? 
The standard deviation behaves like n?p3/?,/(1 — p?)/2. A more general version of the 
central limit theorem than we have discussed tells us the number of triangles tends to have 
a normal distribution with p= (np)?/6 and o = n?p3/?,/(1 — p?)/2. If p is constant, o will 
grow like a constant times n?, which is much smaller than py for large n. Thus the number 
of triangles in a random graph is almost surely close to (np)?/6. O 


*Example 8 (Variance for triangles in random graphs) This is a continuation of the 
previous example. Since the various X,,,,,.~ may not be independent, this is harder. Since 
Var(X) = E(X?) — E(X)?, we will compute E(X7). Since X is a sum of terms of the form 
Corre X? is a sum of terms of the form Xuv,wXa,b,c: Using linearity of expectation, we 
need to compute E(Xu,»,wXa,b,c) for each possibility and add them up. 


Now for the tricky part: This expectation depends on how many vertices {u,v,w} and 
{a,b,c} have in common. 


e If {u,v,w} = {a,b,c}, then XuvwXa,b,c = Xu,v,w and its expectation is p* by the 
previous example. 


e If {u,v,w} and {a, b,c} have two vertices in common, then the two triangles have only 
5 edges total because they have a common edge. Note that Xy».wXa,b,c is 1 if all 
five edges are present and is 0 otherwise. Reasoning as in the previous example, the 
expectation is p. 


e If {u,v,w} and {a, b,c} have less than two vertices in common, we are concerned about 
six edges and obtain p® for the expectation. 


To add up the results in the previous paragraph, we need to know how often each 
occurs in 


x? > ( Ss" sis) ( SS Xe] = Ss" Aaa be F 


{u,v,w}EP3(V) {a,b,c}EP3(V) {u,v,w}EP3(V) 
{a,b,c}EP3(V) 


e When {u,v,w} = {a,b,c}, we are only free to choose {u,v,w} and this can be done 
in (5) ways so we have (3)p° contributed to E(X7?). 
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e Suppose {u,v, w} and {a,b,c} have two vertices in common. How many ways can this 
happen? We can first choose {u,v,w}. Then choose two of u, v, w to be in {a,b,c} 
and then choose the third vertex in {a,b,c} to be different from u, v, and w. This can 


be done in 
(5) x @ x (n — 3) = 3(n — 3) 6 = 12(‘) 


ways. Multiplying this by p® gives its contribution to E(X7”). 


e The remaining case, one vertex or no vertices in common, can be done in a similar 
fashion. Alternatively, we can simply subtract the above counts from all possible ways 
of choosing {u,v,w} and {a,b,c}. This gives us 


(3) «(G)-G)-2@) 


for the third case. Multiplying this by p® gives its contribution to E(X?). 


Since E(X)? = (3) *®, we have that 


Var(X) = E(X)? — E(X*) = (5) io pr 12(‘) (p® — p°), 


after a bit of algebra using the results in the preceding paragraph. Whew! OJ 


The previous material would be more difficult if we had used the model for random 
graphs that was suggested before Definition 5. Why is this? The model we are using lets 
us ignore possible edges that we don’t care about. The other model does not because we 
must be sure that the total number of edges is correct. 


Exercises for Section 1 


1.1. We are interested in the number of simple graphs with V = n. 


(a) Prove that there are 9(3) (2 to the power (5)) such simple graphs. 


(b) How many of them have exactly q edges? 


1.2. Let (V, E,¢) be a graph and let d(v) be the degree of the vertex v € V. Prove that 
Devev dv) = 2|E|, an even number. Conclude that the number of vertices v for 


which d(v) is odd is even. 
1.3. Let Q = (V, E,¢) be a graph where 
Vv = {A, B, C,D,E, E, G. Hh, E _ {a, b, C, d, e, $50; h,i,j, k} 
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and 
a bc de fgh i ji ik 
g=|A A D E AE BFGCA 
BD EBBGFGCCA 
In this representation of ¢, the first row specifies the edges and the two vertices 
below each edge specify the vertices incident on that edge. Here is a pictorial 
representation P(Q) of this graph. 


a f 


P(Q): [an i a a 
Ora FSO; 


Note that $(k) = {A, A} = {A}. Such an edge is called a loop. (See Example 3.) 
Adding a loop to a vertex increases its degree by two. The vertex H, which does 
not belong to ¢(a) for any edge = (i.e., has no edge incident upon it), is called an 
isolated vertex. The degree of an isolated vertex is zero. Edges, such as a and e 
of Q, with the property that ¢(a) = ¢(e) are called parallel edges. If all edge and 
vertex labels are removed from P(Q) then we get the following picture P’(Q): 


P’'(Q) Ce ee e 


The picture P’(Q) represents the “form” of the graph just described and is some- 
times referred to as a pictorial representation of the “unlabeled” graph associated 
with Q. (See Example 5.) For each of the following graphs R, where R = (V, EF, ¢), 
V = {A, B,C, D, E, F,G, H}, draw a pictorial representation of R by starting with 
P'(Q) removing and/or adding as few edges as possible and then labeling the result- 
ing picture with the edges and vertices of R. A graph R which require no additions 
or removals of edges is said to be “of the same form as” or “isomorphic to” the 
graph Q (Example 5). 


(a) Let 
B= (a, bed, 6, 7,9; 0,4,5,8) 


be the set of edges of R and 


“op. ¢€ @ ee f— B h & 4 & 
o=|C C F A H FE E A DAA 
C G G H H H F H GODF 
(b) Let 
HE = {1,2,3,4, 5,6, 7;8,9, 10,11} 
be the set of edges of R and 
1 2 3 4 5 6 7 8 9 10 11 
o=|A FE FE FE F G HBC DE 
G H E F GH BC ODODE 
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1.4. 


1.5. 


1.6. 


GT-12 


Let Q = (V, E,¢) be the graph where 
V= 1A, BCD. EEG, By, B= 10,0, 6;d, 6, FG, 0,4, 9,6, 0} 


and 
abcde fg hi gj iki it 
¢-1|A ADEAEBFGCAE 
B D E BBGFGC€CCAG 


(a) What is the degree sequence of Q? 


Consider the following unlabeled pictorial representation of Q 


(a) Create a pictorial representation of Q by labeling P’(Q) with the edges and 
vertices of Q. 

(b) A necessary condition that a pictorial representation of a graph R can be 
created by labeling P’(Q) with the vertices and edges of R is that the degree 
sequence of R be (0,2, 2,3,4,4,4,5). True of false? Explain. 

(c) A sufficient condition that a pictorial representation of a graph R can be created 


by labeling P’(Q) with the vertices and edges of R is that the degree sequence 
of R be (0,2,2,3,4,4,4,5). True or false? Explain. 


In each of the following problems information about the degree sequence of a graph 

is given. In each case, decide if a graph satisfying the specified conditions exists or 

not. Give reasons in each case. 

(a) A graph Q with degree sequence (1,1, 2,3,3,5)? 

(b) A graph Q with degree sequence (1,2,2,3,3,5), loops and parallel edges al- 
lowed? 

(c) A graph Q with degree sequence (1, 2,2,3,3,5), no loops but parallel edges 
allowed? 

(d) A graph Q with degree sequence (1,2,2,3,3,5), no loops or parallel edges 


allowed? 

(e) A simple graph Q with degree sequence (3,3, 3,3)? 
(f) A graph Q with degree sequence (3,3,3,3), no loops or parallel edges allowed? 
g) A graph Q with degree sequence (3,3,3,5), no loops or parallel edges allowed? 
) 


( 

(h) A graph Q with degree sequence (4,4,4,4,4), no loops or parallel edges al- 
lowed? 

(i) A graph Q with degree sequence (4,4,4,4,6), no loops or parallel edges al- 
lowed? 


Divide the following graphs into isomorphism equivalence classes and justify your 
answer; i.e., explain why you have the classes that you do. In all cases V = 4. 


Section 2: Digraphs, Paths, and Subgraphs 


Cc d e€ 
= Ge a {1 > {2,3} {3,4} {1,4} {2, i 
abe Or Cc D E F ) 
{1,2} ie Ay {1,4} {1,2} {2,3} {3,4} 
LO ee: 3} a 3} {3.4} (1, a (1, 9) a, on) 


Q R S T U 
(d) = Ce {2,4} {1,3} {3,4} {1,2} (1) 


*1.7. In Example 7, suppose that p is a function of n, say p = p(n). 


(a) Show that the expected number of triangles is behaves like 1 for large n if 
p(n) = 63 /n. 


(b) Suppose the expected number of triangles behaves like 1. How does the ex- 
pected number of edges behave? 


*1.8. Instead of looking for triangles as in Example 7, let’s look for quadrilaterals having 
both diagonals. In other words, we’ll look for sets of four vertices such that all of 
the (5) = 6 possible edges between them are present. 


(a) Show that the expected number of such quadrilaterals is (‘})p®. 


(b) Suppose n is large and p is a function of n so that we expect to see 1 quadri- 
lateral on average. About how many edges do we expect to see? 


(c) Generalize this problem from sets of 4 vertices to sets of k vertices. 


*1.9. Show that the variance of X, the number of triangles in a random graph as com- 
puted in Example 8 satisfies 


var(x) = (5 )p*(( p®) +3(n—3)(1 p?)) <3n( p89), 


Hint: 1-p? <1—-—p? <1. 


Section 2: Digraphs, Paths, and Subgraphs 


In this section we introduce the notion of a directed graph and give precise definitions 
of some very important special substructures of both graphs and directed graphs. 


GT-13 


Basic Concepts in Graph Theory 


Example 9 (Flow of commodities) Look again at Example 2. Imagine now that the 
symbols a, b, c, d, e, f and g, instead of standing for route names, stand for commodities 
(applesauce, bread, computers, etc.) that are produced in one town and shipped to another 
town. In order to get a picture of the flow of commodities, we need to know the directions 
in which they are shipped. This information is provided by picture below: 


In set-theoretic terms, the information needed to construct the above picture can be 
specified by giving a pair D = (V, E,¢) where ¢ is a function. The domain of the function 
@ is E = {a,b,c,d,e, f,g} and the codomain is V x V. Specifically, 


6=( a b Cc d € ci g ) 
~ \(B,A) (A,B) (C,A) (C,B) (B,C) (C,B) (D,B))° 


The structure specified by this information is an example of a directed graph, which we now 
define. OJ 


Definition 6 (Directed graph) A directed graph (or digraph) is a triple D = (V, E, ¢) 
where V and E are finite sets and ¢ is a function with domain E and codomain V x V. 
We call E the set of edges of the digraph D and call V the set of vertices of D. 


Just as with graphs, we can define a notion of a simple digraph. A simple digraph is a pair 
D = (V,E), where V is a set, the vertex set, and E C V x V is the edge set. Just as with 
simple graphs and graphs, simple digraphs are a special case of digraphs in which ¢ is the 
identity function on E; that is, d(e) = e for alle € E. 


There is a correspondence between simple graphs and simple digraphs that is fairly 
common in applications of graph theory. To interpret simple graphs in terms of simple 
digraphs, it is best to consider simple graphs with loops (see Example 3 and Exercises 
for Section 1). Thus consider G = (V,F) where E C P2(V) UPi(V). We can identify 
{u,v} € Po(V) UP(V) with (u,v) € V x V and with (v,u) € V x V. In the case were we 
have a loop, u = v, then we identify {u} with (u,u). Here is a picture of a simple graph 
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and its corresponding digraph: 


Cy / , 


e 
Cc Cc 


(a) (b) 


Each, edge that is not a loop in the simple graph is replaced by two edges “in opposite 
directions” in the corresponding simple digraph. A loop is replaced by a directed loop (e.g., 
{A} is replaced by (A, A)). 


Simple digraphs appear in mathematics under another important guise: binary rela- 
tions. A binary relation on a set V is simply a subset of V x V. Often the name of the 
relation and the subset are the same. Thus we speak of the binary relation E C V x V. If 
you have absorbed all the terminology, you should be able to see immediately that (V, E) 
is a simple digraph and that any simple digraph (V’, E’) corresponds to a binary relation 
BCoV SY", 

Recall that a binary relation R is called symmetric if (u,v) € R implies (v,u) € R. 
Thus simple graphs with loops correspond to symmetric binary relations. 


An equivalence relation on a set S is a particular type of binary relation R C S x S. 
For an equivalence relation, we have (z,y) € R if and only if x and y are equivalent (i.e., 
belong to the same equivalence class or block). Note that this is a symmetric relationship, 
so we may regard the associated simple digraph as a simple graph. Which simple graphs 
(with loops allowed) correspond to equivalence relations? As an example, take S = 7 and 
take the equivalence class partition to be {{1,2,3,4}, {5,6,7}}. Since everything in each 
block is related to everything else, there are (5) = 6 non-loops and (a) = 4 loops associated 
with the block {1,2,3,4} for a total of ten edges. With the block {5,6,7} there are three 
loops and three non-loops for a total of six edges Here is the graph of this equivalence 


relation: 


td G ©) 


A complete simple graph G=(V,E) with loops is a graph with every possible edge. 
That is, EF = P2(V) UP,(V). In the above graph, each block of the equivalence relation is 
replaced by the complete simple graph with loops on that block. This is the general rule. 


A basic method for studying graphs and digraphs is to study substructures of these 
objects and their properties. One of the most important of these substructures is called a 
path. 
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Definition 7 (Path, trail, walk and vertex sequence) Let G = (V,E,¢) bea graph. 


Let €1,€2,...,€n—1 be a sequence of elements of E (edges of G) for which there is a 
sequence @1,42,...,@n of distinct elements of V (vertices of G) such that $(e;) = {a;, aj41} 
fori = 1,2,...,n—1. The sequence of edges €1,€2,...,€n—1 is called a path in G. The 
sequence of vertices a1, 42,...,@n is called the vertex sequence of the path. (Note that since 
the vertices are distinct, so are the edges.) 


If we require that e1,...,€n—1 be distinct, but not that a,,...,@n be distinct, the 
sequence of edges is called a trail. 


If we do not even require that the edges be distinct, it is called a walk. 


If G = (V,E,¢) is a directed graph, then $(e;) = {a;,a;41} is replaced by ¢(e;) = 
(a;,@;41) in the above definition to obtain a directed path, trail, and walk respectively. 


Note that the definition of a path requires that it not intersect itself (i.e., have repeated 
vertices), while a trail may intersect itself. Although a trail may intersect itself, it may not 
have repeated edges, but a walk may. If P = (e1,...,€n—1) is a path in G = (V, E, ¢) with 
vertex sequence @1,...,@, then we say that P is a path from a, to ay. Similarly for a trail 
or a walk. 


In the graph of Example 2, the sequence c¢, d, g is a path with vertex sequence A, C, B, D. 
If the graph is of the form G = (V,E) with EF C P2(V), then the vertex sequence alone 
specifies the sequence of edges and hence the path. Thus, Example 1, the vertex sequence 
MN, SM, SE, TM specifies the path {MN, SM}, {SM, SE}, {SE, TM}. Similarly for di- 
graphs. Consider the graph of Example 9. The edge sequence P = (g,e,c) is a directed 
path with vertex sequence (D, B,C, A). The edge sequence P = (g,e,c,b,a) is a directed 
trail, but not a directed path. The edge sequence P = (d,e,d) is a directed walk, but not 
a directed trail. 


Note that every path is a trail and every trail is a walk, but not conversely. However, 
we can show that, if there is a walk between two vertices, then there is a path. This rather 
obvious result can be useful in proving theorems, so we state it as a theorem. 


Theorem 2 (Walk implies path) Suppose u # v are vertices in the graph G = (V, E, ¢). 
The following are equivalent: 


(a) There is a walk from u to v. 
(b) There is a trail from u to v. 
(c) There is a path from u to v. 


Furthermore, given a walk from u to v, there is a path from u to v all of whose edges are 
in the walk. 


Proof: Since every path is a trail, (c) implies (b). Since every trail is a walk, (b) implies 
(a). Thus it suffices to prove that (a) implies (c). Let e1,¢€2,...,e% be a walk from u to 
v. We use induction on n, the number of repeated vertices in a walk. If the walk has no 
repeated vertices, it is a path. This starts the induction at n = 0. Suppose n > 0. Let 
r be a repeated vertex. Suppose it first appears in edge e; and last appears in edge e;. 
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If r = u, then e;,...,e,% is a walk from u to v in which r is not a repeated vertex. If 
r =v, then e1,...,e; is a walk from u to v in which r is not a repeated vertex. Otherwise, 
€1,.-.,€;4,€;,.-.,€p is a walk from u to v in which r is not a repeated vertex. Hence there 


are less than n repeated vertices in this walk from u to v and so there is a path by induction. 
Since we constructed the path by removing edges from the walk, the last statement in the 
theorem follows. 0 


Note that the theorem and proof are valid if graph is replaced by digraph and walk, trail, 
and path are replaced by directed walk, trail, and path. 


Another basic notion is that of a subgraph of G = (V, E,¢), which we will soon define. 
First we need some terminology about functions. By a restriction ¢' of @ to E’ C E, we 
mean the function ¢/ with domain E’ and satisfying ¢’(x) = ¢(«) for all « € E’. (When 
forming a restriction, we may change the codomain. Of course, the new codomain must 
contain Image(¢’) = ¢(£). In the following definition, the codomain of ¢’ must be P2(V’) 
since G’ is required to be a graph.) 


Definition 8 (Subgraph) Let G = (V,E,¢) bea graph. A graph G’ = (V’, E’,¢’') isa 
subgraph of G if V' CV, E’ C E, and ¢’ is the restriction of ¢ to E’. 


As we have noted, the fact that G’ is itself a graph means that $(x) € P2(V") for each 
x € E” and, in fact, the codomain of ¢’ must be P2(V’). If G is a graph with loops, 
the codomain of ¢’ must be P2(V’) UP, (V’). This definition works equally well if G is a 
digraph. In that case, the codomain of ¢’ must be V x V. 


Example 10 (Subgraph — key information) For the graph G = (V,E,¢) below, 
let G’ = (V’, E’, ¢’) be defined by V’ = {A,B,C}, E’ = {a,b,c, f}, and by ¢’ being the 
restriction of ¢ to E’ with codomain P2(V"). Notice that ¢’ is determined completely from 
knowing V’, E’ and ¢. Thus, to specify a subgraph G’, the key information is V’ and E’. 


As another example from the same graph, we let V’ = V and E’ = {a,b,c, f}. In this 
case, the vertex D is not a member of any edge of the subgraph. Such a vertex is called an 
isolated vertex of G’. (See also Exercises for Section 1.) 


One way of specifying a subgraph is to give a set of edges E’ C E and take V’ to be 
the set of all vertices on some edge of E’. In other words , V’ is the union of the sets 6(x) 
over all x € E’. Such a subgraph is called the subgraph induced by the edge set E’ or the 
edge induced subgraph of E’. The first subgraph of this example is the subgraph induced 
by E’ = {a,b,¢, f}. 


Likewise, given a set V’ C V, we can take E’ to be the set of all edges x € E such 
that ¢(a) C V’. The resulting subgraph is called the subgraph induced by V’ or the vertex 
induced subgraph of V’. Referring to the picture again, the edges of the subgraph induced 


GT-17 


Basic Concepts in Graph Theory 


by V’ = {C, B}, are E’ = {d,e, f}. 


CG D 


Look again at the above graph. In particular, consider the path c,a with vertex sequence 
C,A,B. Notice that the edge d has ¢(d) = {C, B}. The subgraph G’ = (V’, E’, ¢’), where 
V' ={C,A, B} and E” = {c,a,d} is called a cycle of G. In general, whenever there is a path 
in G, say €1,...,@n—1 with vertex sequence a1,...,@,, and an edge x with ¢(x) = {a , an}, 
then the subgraph induced by the edges e€1,...,@n—1, x is called a cycle of G. Parallel edges 
like a and b in the preceding figure induce a cycle. A loop also induces a cycle. J 


The formal definition of a cycle is: 


Definition 9 (Circuit and Cycle) Let G = (V,E,¢) be a graph and let e€1,...,en be 
a trail with vertex sequence @1,...,@n,a,. (It returns to its starting point.) The subgraph 
G’ of G induced by the set of edges {e1,...,@€n} is called a circuit of G. The length of the 
circuit is n. 


e If the only repeated vertices on the trail are a, (the start and end), then the circuit is 
called a simple circuit or cycle. 


e If “trail” is replaced by directed trail, we obtain a directed circuit and a directed cycle. 


In our definitions, a path is a sequence of edges but a cycle is a subgraph of G. In actual 
practice, people often think of a cycle as a path, except that it starts and ends at the same 
vertex. This sloppiness rarely causes trouble, but can lead to problems in formal proofs. 
Cycles are closely related to the existence of multiple paths between vertices: 


Theorem 3 (Cycles and multiple paths) | Two vertices u 4 v are on a cycle of G if 
and only if there are at least two paths from u to v that have no vertices in common except 
the endpoints u and v. 


Proof: Suppose u and v are on a cycle. Follow the cycle from u to v to obtain one path. 
Then follow the cycle from v to u to obtain another. Since a cycle has no repeated vertices, 
the only vertices that lie in both paths are u and v. On the other hand, a path from u to 
v followed by a path from v to u is a cycle if the paths have no vertices in common other 
thanuandv. QJ 


GT-18 


Section 2: Digraphs, Paths, and Subgraphs 


One important feature of a graph is whether or not any pair of vertices can be connected 
by a path. You can probably imagine, without much difficulty, applications of graph theory 
where this sort of “connectivity” is important. Not the least of such examples would be 
communication networks. Here is a formal definition of connected graphs. 


Definition 10 (Connected graph) Let G = (V,E,¢) be a graph. If for any two 
distinct elements u and v of V there is a path P from u to v then G is a connected graph. 
If |V| =1, then G is connected. 


We make two observations about the definition. 


e Because of Theorem 2, we can replace “path” in the definition by “walk” or “trail” if 
we wish. (This observation is used in the next example.) 


e The last sentence in the definition is not really needed. To see this, suppose |V| = 1. 
Now G is connected if, for any two distinct elements u and v of V there is a path from 
u to v. This is trivially satisfied since we cannot find two distinct elements in the one 
element set V. 


The graph of Example 1 has two distinct “pieces.” It is not a connected graph. There 
is, for example, no path from u = TM tov = C'S. Note that one piece of this graph consists 
of the vertex induced subgraph of the vertex set {CS,EN,SH, RL} and the other piece 
consists of the vertex induced subgraph of {TM,SE,MN,SM}. These pieces are called 
connected components of the graph. This is the case in general for a graph G = (V, E, @): 
The vertex set is partitioned into subsets Vj, V2,...,Vm such that if u and v are in the 
same subset then there is a path from wu to v and if they are in different subsets there is no 
such path. The subgraphs G; = (V1, £1, ¢1),..-;Gm = (Vm; Em, $m) induced by the sets 
Vi,...,Vm are called the connected components of G. Every edge of G appears in one of 
the connected components. To see this, suppose that {u,v} is an edge and note that the 
edge is a path from u to v and so u and v are in the same induced subgraph, G;. By the 
definition of induced subgraph, {u,v} is in G;. 


Example 11 (Connected components as an equivalence relation) You may have 
noticed that the “definition” that we have given of connected components is a bit sloppy: 
We need to know that the partitioning into such subsets can actually occur. To see that this 
is not trivially obvious, define two integers to be “connected” if they have a common factor. 
Thus 2 and 6 are connected and 3 and 6 are connected, but 2 and 3 are not connected and 
so we cannot partition the set V = {2,3,6} into “connected components”. We must use 
some property of the definition of graphs and paths to show that the partitioning of vertices 
is possible. One way to do this is to construct an equivalence relation. 


For u,v € V, write u ~ v if and only if either u = v or there is a walk from u to 
v. It is clear that ~ is reflexive and symmetric. We now prove that it is transitive. Let 
u~vu~w. The walk from u to v followed by the walk from v to w is a walk from u to 
w. This completes the proof that u ~ v is an equivalence relation. The relation partitions 
V into subsets Vj,...,Vm. By Theorem 2, the vertex induced subgraphs of the V; satisfy 
Definition 10. O 


When talking about connectivity, graphs and digraphs are different. In a digraph, the 
fact that there is a directed walk from u to v does not, in general, imply that there is a 
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directed walk from v to u. Thus, the “directed walk relation”, unlike the “walk relation” 
is not symmetric. This complicates the theory of connectivity for digraphs. 


Example 12 (Eulerian graphs) We are going to describe a process for constructing a 
graph G = (V, E,¢) (with loops allowed). Start with V = {v;} consisting of a single vertex 
and with EF = §. Add an edge e1, with ¢(e1) = {v1, ve}, to FE. If vu; = v2, we have a graph 
with one vertex and one edge (a loop), else we have a graph with two vertices and one 
edge. Keep track of the vertices and edges in the order added. Here (v1, v2) is the sequence 
of vertices in the order added and (e;) is the sequence of edges in order added. Suppose 
we continue this process to construct a sequence of vertices (not necessarily distinct) and 
sequence of distinct edges. At the point where k distinct edges have been added, if v is 
the last vertex added, then we add a new edge e;41, different from all previous edges, with 
O(€x41) = {v,v’} where either v’ is a vertex already added or a new vertex. Here is a 
picture of this process carried out with the edges numbered in the order added 


where the vertex sequence is 
a = (0,4, 6,@,d,a,.), fg, €¢,¢,9)- 


Such a graph is called a graph with an Eulerian trail. The edges, in the order added, are 
the Eulerian trail and S' is the vertex sequence of the trail 


By construction, if G is a graph with an Eulerian trail, then there is a trail in G that 
includes every edge in G. If there is a circuit in G that includes every edge of G then G 
is called an Eulerian circuit graph or graph with an Eulerian circuit. Thinking about the 
above example, if a graph has an Eulerian trail but no Eulerian circuit, then all vertices of 
the graph have even degree except the start vertex (a in our example with degree 5) and 
end vertex (g in our example with degree 3). If a graph has an Eulerian circuit then all 
vertices have even degree. The converses in each case are also true (but take a little work 
to show): If G is a connected graph in which every vertex has even degree then G has an 
Eulerian circuit. If G is a connected graph with all vertices but two of even degree, then G 
has an Eulerian trail joining the two vertices of odd degree. O 


Here is a precise definition of Eulerian trail and circuit. 


Definition 11 (Eulerian trail, circuit) Let G = (V,E,¢) be a connected graph. If 
there is a trail with edge sequence (€1,€2,...,€,) in G which uses each edge in E, then 
(€1,€2,---,¢€x) is called an Eulerian trail. If there is a circuit C = (V’, E’,¢') in G with 
E’ = E, then C is called an Eulerian circuit. 


The ideas of a directed Eulerian circuit and directed Eulerian trail for directed graphs are 
defined in exactly the same manner. 
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An Eulerian circuit in a graph contains every edge of that graph. What about a cycle 
that contains every vertex but not necessarily every edge? Our next example discusses that 
issue. 


Example 13 (Hamiltonian cycle) Start with a graph G’ = (V,E’,¢’) that is a cycle 
and then add additional edges, without adding any new vertices, to obtain a graph G = 
(V, E,¢). As an example, consider 


i 


where the first graph G’ = (V, E’, ¢’) is the cycle induced by the edges {a, b,c, d,e, f}. The 
second graph G = (V, E, ¢) is obtained from G’ by adding edges g,h,i and 7. A graph that 
can be constructed from such a two-step process is called a Hamiltonian graph. The cycle 
G’ is called a Hamiltonian cycle for G. 


Definition 12 (Hamiltonian cycle, Hamiltonian graph) A cycle in a graph G = 
(V, E,¢) is a Hamiltonian cycle for G if every element of V is a vertex of the cycle. A 
graph G = (V,E,¢) is Hamiltonian if it has a subgraph that is a Hamiltonian cycle for G. 


Notice that an Eulerian circuit uses every edge exactly once and a Hamiltonian cycle uses 
every vertex exactly once. We gave a very simple characterization of when a graph has 
an Eulerian circuit (in terms of degrees of vertices). There is no simple characterization 
of when a graph has a Hamiltonian cycle. On the contrary, the issue of whether or not a 
graph has a Hamiltonian cycle is notoriously difficult to resolve in general. 


As we already mentioned, connectivity issues in digraphs are much more difficult than 
in graphs. A digraph is strongly connected if, for every two vertices v and w there is a 
directed path from v to w. From any digraph D, we can construct a simple graph S(D) on 
the same set of vertices by letting {v,w} be an edge of S(D) if and only if at least one of 
(u,v) and (v, u) is an edge of D. You should be able to show that if D is strongly connected 
then S(D) is connected. The converse is false. As an example, take D = (V, F) to be the 
simple digraph where V = {1,2} and FE = {(1,2)}. There is no directed path from 2 to 1, 
but clearly S(D) = (V, {{1,2}}) is connected. 


Other issues for digraphs analogous to those for graphs work out pretty well, but are 
more technical. An example is the notion of degree for vertices. For any subset U of the 
vertices V of a directed graph D = (V,E), define di,(U) to be the number of edges of e 
of D with ¢(e) of the form (w,u) where u € U and w ¢ U. Define dous(U) similarly. If 
U = {v} consists of just one vertex, din(U) is usually written simply as d;,(v) rather than 
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the more technically correct din({v}). Similarly, we write dou:(v). You should compute 
din(v) and dout(v) for the vertices v of the graph of Example 9. You should be able to show 
that 5) din(v) = > dout(v) = |E|, where the sums range over all v € V. See the Exercises 
for Section 1 for the idea. 


Example 14 (Bicomponents of graphs) Let G = (V,E,¢) bea graph. For e, f € E 
write e ~ f if either e = f or there is a cycle of G that contains both e and f. We claim 
that this is an equivalence relation. The reflexive and symmetric parts are easy. Suppose 
thate ~ f ~ g. If e = g, then e ~ g, so suppose that e 4 g. Let ¢(e) = {v1, v2}. Let 
C(e, f) be the cycle containing e and f and C(f,g) the cycle containing f and g. In C(e, f) 
there is a path P, from v; to vg that does not contain e. Let x and y # =x be the first 
and last vertices on P; that lie on the cycle containing f and g. We know that there must 
be such points because the edge f is on P;. Let P, be the path in C(e,f) from y to x 
containing e. In C(f,g) there is a path P3 from x to y containing g. We claim that P, 
followed by P3 defines a cycle containing e and g. 


Some examples may help. Consider a graph that consists of two disjoint cycles that 
are joined by an edge. There are three bicomponents — each cycle and the edge joining 
them. Now consider three cycles that are disjoint except for one vertex that belongs to all 
three of them. Again there are three bicomponents — each of the cycles. 


Since ~ is an equivalence relation on the edges of G, it partitions them. If the partition 
has only one block, then we say that G is a biconnected graph. If E’ is a block in the 
partition, the subgraph of G induced by E’ is called a bicomponent of G. Note that the 
bicomponents of G are not necessarily disjoint: Bicomponents may have vertices in common 
(but never edges). There are four bicomponents in the following graph. Two are the cycles, 
one is the edge {C,O}, and the fourth consists of all of the rest of the edges. 


Exercises for Section 2 


2.1. A graph G = (V,E) is called bipartite if V can be partitioned into two sets C 
and S such that each edge has one vertex in C' and one vertex in S. As a specific 
example, let C be the set of courses at the university and S' the set of students. 
Let V = CUS and let {s,c} € E if and only if student s is enrolled in course c. 


(a) Prove that G = (V, FE) is a simple graph. 
(b) Prove that every cycle of G has an even number of edges. 


2.2. In each of the following graphs, find the longest trail (most edges) and longest 
circuit. If the graph has an Eulerian circuit or trail, say so. 
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2.3. For each of the following graphs G = (V, E, ¢), find a cycle in G of maximum length. 
State whether or not the graph is Hamiltonian. 


2.4. We are interested in the number of simple digraphs with V =n 
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(a) Find the number of them. 
(b) Find the number of them with no loops. 
(c) In both cases, find the number of them with exactly q edges. 


2.5. An oriented simple graph is a simple graph which has been converted to a digraph 
by assigning an orientation to each edge. The orientation of {u,v} can be thought 
of as a mapping of it to either (u,v) or (v,u). 


(a) Give an example of a simple digraph that has no loops but is not an oriented 
simple graph 


(b) Find the number of oriented simple digraphs. 
(c) Find the number of them with exactly q edges. 


2.6. A binary relation R on S is an order relation if it is reflexive, antisymmetric, and 
transitive. R is antisymmetric if for all (x,y) € Rwith« #y, (y,x) ¢ R. Given an 
order relation R, the covering relation H of R consists of all (x, z) € R, x 4 z, such 
that there is no y, distinct from both x and z, such that (x,y) € R and (y,z) € R. 
A pictorial representation of the covering relation as a directed graph is called a 
“Hasse diagram” of H. 


(a) Show that the divides relation on 
S = {2,3,4,5,6, 7,8, 9, 10, 11, 12, 13, 14, 15, 16} 


is an order relation. By definition, (x,y) is in the divides relation on S$ is x 
is a factor of y. Thus, (4,12) is in the divides relation. x|y is the standard 
notation for x is a factor of y. 


(b) Find and draw a picture of the directed graph of the covering relation of the 
divides relation. 
Hint: You must find all pairs (2, z) € S x S such that x|z but there does not 
exist any y, x < y < z, such that zly and y|z. 


Section 3: Trees 


Trees play an important role in a variety of algorithms. We have used decision trees 
to enhance our understanding of recursion. In this section, we define trees precisely and 
look at some of their properties. 


Definition 13 (Tree) If G is a connected graph without any cycles then G is called a 
tree. (If |\V| = 1, then G is connected and hence is a tree.) A tree is also called a free tree. 


The graph of Example 2 is connected but is not a tree. It has many cycles, including 
({A, B,C}, {a, e,c}). The subgraph of this graph induced by the edges {a, e, g} is a tree. If 
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G is a tree, then ¢ is an injection since if e; # e2 and ¢(e1) = (eg), then {e1, e2} induces 
a cycle. In other words, any graph with parallel edges is not as tree. Likewise, a loop is a 
cycle, so a tree has no loops. Thus, we can think of a tree as a simple graph when we are 
not interested in names of the edges. 


Since the notion of a tree is so important, it will be useful to have some equivalent 
definitions of a tree. We state them as a theorem 


Theorem 4 (Alternative definitions of a tree) If G is a connected graph, the 
following are equivalent. 


(a) G is a tree. 

(b) G has no cycles. 

(c) For every pair of vertices u # v in G, there is exactly one path from u to v. 
(d) Removing any edge from G gives a graph which is not connected. 


(e) The number of vertices of G is one more than the number of edges of G. 


Proof: We are given that G is connected, thus, by the definition of a tree, (a) and (b) 
are equivalent. 


Theorem 3 can be used to prove that (b) implies (c). We leave that as an exercise 
(show not (c) implies not (b)). 


If {u,v} is an edge, it follows from (c) that the edge is the only path from wu to v and 
so removing it disconnects the graph. Hence (c) implies (d). 


We leave it as an exercise to prove that (d) implies (b) (show not (b) implies not (d)). 


Thus far, we have shown (a) and (b) are equivalent, and we have shown that (b) implies 
(c) implies (d) implies (b), so (a), (b), (c), and (d) are all equivalent. All that remains is 
to include (e) in this equivalence class of statements. Do do this, all we have to do is show 
that (e) implies any of the equivalent statements (a), (b), (c), and (d) and, conversely, some 
one of (a), (b), (c), and (d) implies (e). We shall show that (b) implies (e) and that (e) 
implies (a). 


We first show that (b) implies (e). We will use induction on the number of vertices of 
G. If G has one vertex, it has no edges and (e) is satisfied. Otherwise, we claim that G has 
a vertex u of degree 1; that is, it lies on only one edge {u,w}. We prove this claim shortly. 
Remove u and {u,v} to obtain a graph H with one less edge and one less vertex. Since G 
is connected and has no cycles, the same is true of H. By the induction hypothesis, H has 
one less edge than vertex. Since we got from G to H by removing one vertex and one edge, 
G must also have one less edge than vertex. By induction, the proof is done. It remains to 
prove the existence of u. Suppose no such u exists; that is, suppose that each vertex lies 
on at least two edges. We will derive a contradiction. Start at any vertex v, of G leave v; 
by some edge e; to reach another vertex vg. Leave vg by some edge e2 different from the 
edge used to reach vg. Continue with this process. Since each vertex lies on at least two 
edges, the process never stops. Hence we eventually repeat a vertex, say 


U1, €1,U2,-+-+5Uk,€ky-++5 Un; En; Un+1 = Vk: 
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The edges ez,...,€n form a cycle, which is a contradiction. 


Having shown that (b) implies (e), we now show that (e) implies (a). We use the 
contrapositive and show that not (a) implies not (e). Thus we assume G is not a tree. 
Hence, by (d) we can remove an edge from G to get a new graph which is still connected. 
If this is not a tree, repeat the process and keep doing so until we reach a tree T’. For a 
tree T’, we trivially satisfy (a) which implies (b) and (b) implies (e). Thus, the number of 
vertices is now one more than the number of edges in the graph 7. Since, in going from G 
to T’, we removed edges from G but did not remove vertices, G must have at least as many 
edges as vertices. This shows not (a) implies not (e) and completes the proof. 0 


Definition 14 (Forest) A forest is a graph all of whose connected components are trees. 
In particular, a forest with one component is a tree. (Connected components were defined 
following Definition 10.) 


Example 15 (A relation for forests) Suppose a forest has v vertices, e edges and c 
(connected) components. What values are possible for the triple of numbers (v,e,c)? It 
might seem at first that almost anything is possible, but this is not so. In fact v —c =e 
because of Theorem 4(e). Why? Let the forest consist of trees T,,...,7~ and let the triples 
for T; be (v;, e;,c;). Since a tree is connected, c; = 1. By the theorem, e; = v; — 1. Since 
V=Uip +++: +, ande =e, +...+€,. we have 


e = (vy, —1)+(vo-1)4+--- +(e -1) = (v1 +--+ +¥-) —C = Ue. 


Suppose a forest has e = 12 and v = 15. We know immediately that it must be made 
up of three trees because c = v —-e = 15 — 12. 


Suppose we know that a graph G = (V, E,¢) has v = 15 and c = 3, what is the fewest 
edges it could have? For each component of G, we can remove edges one by one until we 
cannot remove any more without breaking the component into two components. At this 
point, we are left with each component a tree. Thus we are left with a forest of c = 3 trees 
that still has v = 15 vertices. By our relation v — c = e, this forest has 12 edges. Since we 
may have removed edges from the original graph to get to this forest, the original graph 
has at least 12 edges. 


What is the maximum number of edges that a graph G = (V,E,¢) with v = 15 and 
c = 3 could have? Since we allow multiple edges, a graph could have an arbitrarily large 
number of edges for a fixed v and c — if e is an edge with ¢(e) = {u,v}, add in as many 
edges e; with ¢(e;) = {u,v} as you wish. Hence we will have to insist that G be a simple 
graph. 


What is the maximum number of edges that a simple graph G with v= 15 andc = 3 
could have? This is a bit trickier. Let’s start with a graph where c is not specified. The 
edges in a simple graph are a subset of P2(V) and since P2(V) has (3) elements, a simple 
graph with v vertices has at most (5) edges. 


Now let’s return to the case when we know there must be three components in our 
simple graph. Suppose the number of vertices in the components are v1, vg and v3. Since 
there are no edges between components, we can look at each component by itself. Using 
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the result in the previous paragraph for each component, the maximum number of possible 
edges is CS) + (S) ++ (2). We don’t know v1, v2, v3. All we know is that they are strictly 
positive integers that sum to v. It turns out that the maximum occurs when one of 1; is 
as large as possible and the others equal 1, but the proof is beyond this course. Thus the 
answer is (aly which in our case is (2) = 78. In general, if there were c components, 
c — 1 components would have one vertex each and the remaining component would have 


v —(c—1) =v+1-—c vertices. Hence there can be no more than re) edges. 


Reviewing what we’ve done, we see: 
e There is no graph G = (V,E,¢) with v-—c>e. 


e If v—c =e, the graph is a forest of c trees and any such forest will do as an 
example. 


e If v—c<_e, there are many examples, none of which are forests. 


e Ifv—c<e and we have a simple graph, then we must have e < Gangs oO 


Recall that decision trees, as we have used them, have some special properties. First, 
they have a starting point. Second, the edges (decisions) out of each vertex are ordered. 
We now formalize these concepts. 


Definition 15 (Rooted graph) A pair (G,v), consisting of a graph G = (V, E,@) and 
a specified vertex v, is called a rooted graph with root v. 


Definition 16 (Parent, child, sibling and leaf) Let (T,r) be a rooted tree. If w 
is any vertex other than r, let r = v9,U1,...,Uk,Ug+1 = w, be the list of vertices on the 
unique path from r to w. We call vz, the parent of w and call w a child of v,. Parents 
and children are also called fathers and sons. Vertices with the same parent are siblings. 
A vertex with no children is a leaf. All other vertices are internal vertices of the tree. 


Definition 17 (Rooted plane tree) Let (T,r) be a rooted tree. For each vertex, order 
the children of the vertex. The result is a rooted plane tree, which we abbreviate to RP- 
tree. RP-trees are also called ordered trees. An RP-tree is also called, in certain contexts, 
a decision tree, and, when there is no chance of misunderstanding, simply a tree. 


Since almost all trees in computer science are rooted and plane, computer scientists usually 
call a rooted plane tree simply a tree. It’s important to know what people mean! 


Example 16 (A rooted plane tree) Below is a picture of a rooted plane tree T = 
(V, E,¢). In this case V = 11 and E = {a,...,j}. There are no parallel edges or loops, 
as required by the definition of a RP-tree. The root is r = 1. For each vertex, there is a 
unique path from the root to that vertex. Since ¢ is an injection, once ¢@ has been defined 
(as it is in the picture), that unique path can be specified by the vertex sequence alone. 
Thus, the path from the root to 6 is (1,3,6). The path from the root to 9 is (1,3,6,9). 
Sometimes computer scientists refer to the path from the root to a vertex v as the “stack” 
of v. 
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In the tree below, the vertex 6 is the parent of the vertex 9. The vertices 8, 9, 10, and 
11 are the children of 6 and, they are siblings of each other. The leaves of the tree are 4, 5, 
7, 8, 9, 10, and 11. All other vertices (including the root) are internal vertices of the tree. 


Remember, an RP-tree is a tree with added properties. Therefore, it must satisfy (a) 
through (e) of Theorem 4. In particular, T has no cycles. Also, there is a unique path 
between any two vertices (e.g., the path from 5 to 8 is (5,2,1,3,6,8)). Removing any edge 
gives a graph which is not connected (e.g., removing j disconnects T into a tree with 10 
vertices and a tree with 1 vertex; removing e disconnects T' into a tree with 6 vertices and 
one with 5 vertices). Finally, the number of edges (10) is one less than the number of 
vertices. 


1 


Example 17 (Traversing a rooted plane tree) Just as in the case of decision trees, 
one can define the notion of depth first traversals of a RP-tree. 
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Imagine going around (“traversing”) the above RP-tree following arrows. Start at the root, 
1, go down edge a to vertex 2, etc. Here is the sequence of vertices as encountered in this 
process: 1, 2, 4, 2, 5, 2, 1, 3, 6, 8, 6, 9, 6, 10, 6, 11, 6, 3, 7, 3, 1. This sequence of vertices 
is called the depth first vertex sequence, DF V(T), of the RP-tree T. The number of times 
each vertex appears in DFV(T) is one plus the number of children of that vertex. For 
edges, the corresponding sequence is a, c, c, d, d, a, b, e, g, g, h, h, i, 1, 9, j, e, f, f, 6. This 
sequence is the depth first edge sequence, DFE(T), of the tree. Every edge appears exactly 
twice in DFE(T). If the vertices of the RP-tree are read left to right, top to bottom, we 
obtain the sequence 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11. This is called the breadth first vertex 
sequence, BFV(T). Similarly, the breadth first edge sequence, BFE(T), is a, b, c, d, e, f, 9, 
h, 4, j. 

The sequences BFV(T) and BFE(Z) are linear orderings of the vertices and edges 
of the RP-tree T (i.e., each vertex or edge appears exactly once in the sequence). We 
also associate linear orderings with DF V(T) called the preorder sequence of vertices of T, 
PREV(T), and the postorder sequence of vertices of T, POSV(T). 


PREV(T) = 1,2,4,5,3,6,8,9,10,11,7 is the sequence of first occurrences of the ver- 
tices of T in DFV(T). 


POSV(T) = 4,5, 2,8,9, 10, 11, 6,7, 3, 1 is the sequence of last occurrences of the vertices 
of T in DFV(T). 


Notice that the order in which the leaves of T appear, 4, 5, 8, 9, 10, 11, is the same in 
both PREV(T) and POSV(T). Can you see why this is always true for any tree? 


*Example 18 (The number of labeled trees) How many n-vertex labeled trees are 
there? In other words, count the number of trees with vertex set V = n. The answer has 
been obtained in a variety of ways. We will do it by establishing a correspondence between 
trees and functions by using digraphs. 


Suppose f is a function from V to V. We can represent this as a simple digraph (V, E) 
where the edges are {(v, f(v)) | v € V}. The function 


corresponds to the directed graph 


\ 2 
ee CD 
C>} Zz. 


Such graphs are called functional digraphs. You should be able to convince yourself that 
a functional digraph consists of cycles (including loops) with each vertex on a cycle being 
the root of a tree of noncyclic edges. The edges of the trees are directed toward the roots. 
In the previous figure, 


7 
5 
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e 1 is the root of the tree with vertex set {1,3,9}, 

e 2 is the root of the tree with vertex set {2,4,7}, 

e 5 is the root of the tree with vertex set {5}, 

e 6 is the root of the tree with vertex set {6}, 

e 8 is the root of the tree with vertex set {8}, 

e 10 is the root of the tree with vertex set {10} and 
e 11 is the root of the tree with vertex set {11}. 


In a tree, there is a unique path from the vertex 1 to the vertex n. Remove all the 
edges on the path and list the vertices on the path, excluding 1 and n, in the order they 
are encountered. Interpret this list as a permutation in 1 line form. Draw the functional 
digraph for the cycle form, adding the cycles (1) and (n). Add the trees that are attached 
to each of the cycle vertices, directing their edges toward the cycle vertices. Consider the 
following figure. 


The one line form is 10,8,2,5,6. In two line form it is é : ; : $6): Thus the 
cycle form is (2,10,6)(5,8). When we add the two cycles (1) and (11) to this, draw the 
directed graph, and attach the directed trees, we obtain the functional digraph pictured 


earlier. 


We leave it to you to convince yourself that this gives us a one-to-one correspondence 
between trees with V = n and functions f :n — n with f(1) = 1 and f(n) =n. In creating 
such a function, there are n choices for each of f(2),...,f(n —1). Thus there are n”~? 
such functions and hence n”~? trees. OJ 


Spanning Trees 


Trees are not only important objects of study per se, but are important as special subgraphs 
of general graphs. A spanning tree is one such subgraph. For notational simplicity, we shall 
restrict ourselves to simple graphs, G = (V, £), in the following discussion. The ideas we 
discuss extend easily to graphs G = (V, E,¢), even allowing loops. 


Definition 18 (Spanning tree) A spanning tree of a (simple) graph G = (V,E) isa 
subgraph T = (V, E’) which is a tree and has the same set of vertices as G. 
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Example 19 (Connected graphs and spanning trees) Since a tree is connected, 
a graph with a spanning tree must be connected. On the other hand, it is not hard to 
see that every connected graph has a spanning tree. Any simple graph G = (V, F) has a 
subgraph that is a tree, T’ = (V’, E’). Take V’ = {v} to be one vertex and E” empty. 
Suppose that T’ = (V’, E’) is the largest such “subtree.” If J’ is not a spanning tree then 
there is a vertex w of G that is not a vertex of T’. If G is connected, choose a vertex u in 
T’' and a path w = 21,22,...,e, = u from w to u. Let 7, 1 <7 <-k, be the first integer 
such that x; is a vertex of T’. Then adding the edge {x;_1,x,;} and the vertex x;_; to T’ 
creates a subtree T' of G that is larger than T’, a contradiction of the maximality of T’. 
We have, in fact, shown that a graph is connected if and only if every maximal subtree is a 
spanning tree. Thus we have: A graph is connected if and only if it has a spanning tree. It 
follows that, if we had an algorithm that was guaranteed to find a spanning tree whenever 
such a tree exists, then this algorithm could be used to decide if a graph is connected. 


Example 20 (Minimum spanning trees) Suppose we wish to install “lines” to link 
various sites together. A site may be a computer installation, a town, or a factory. A line 
may be a digital communication channel, a rail line or, a shipping route for supplies. We’ll 
assume that 


(a) a line operates in both directions; 
(b) it must be possible to get from any site to any other site using lines; 


(c) each possible line has a cost (rental rate, construction cost, or shipping cost) 
independent of each other line’s cost; 


(d) we want to choose lines to minimize the total cost. 


We can think of the sites as vertices V in a (simple) graph, the possible lines as edges E 
and the costs as a function ’ from the edges to the positive real numbers. Because of (a) 
and (b), the lines E’ C E we actually choose will be such that T = (V, E’) is connected. 
Because of (d), T will be a spanning tree since, if it had more edges, we could delete some, 
but if we delete any from a tree it will not be connected by Theorem 4. QJ 


We now formalize these ideas in a definition: 


Definition 19 (Weights in a graph) Let G = (V,E) bea simple graph and let be 
a function from E to the positive real numbers. We call X(e) the weight of the edge e. If 
H = (V’,E’) is a subgraph of G, then \(H), the weight of H, is the sum of X(e’) over all 
e € E’. 

A minimum weight spanning tree for a connected graph G is a spanning tree such that 
A(T) < A(T’) whenever T’ is another spanning tree. 


How can we find a minimum weight spanning tree 7’? One approach is to construct 
T by adding an edge at a time in a greedy way. Since we want to minimize the weight, 
“oreedy” means keeping the weight of each edge we add as low as possible. Here’s such an 
algorithm. 


Theorem 5 (Minimum weight spanning tree: Prim’s algorithm) Let G = (V, E) 
be a simple graph with edge weights given by 4. If the algorithm stops with V'’ 4 V,G 
has no spanning tree; otherwise, (V, E’) is a minimum weight spanning tree for G. 


GT-31 


Basic Concepts in Graph Theory 


1. Start: Let E’ = 0 and let V’ = {vo} where vo is any vertex in V. 


2. Possible Edges: Let F C E be those edges f = {x,y} with one vertex in V’ and 
one vertex not in V’. If F = 0, stop. 


3. Choose Edge Greedily: Let f = {x,y} be such that \(f) is a minimum over 
all f € F. Replace V’ with V' U {x,y} and E"’ with E’ U{f}. Go to Step 2. 


Proof: We begin with the first part; i.e, if the algorithm stops with V’ 4 V, then G 
has no spanning tree. The argument is similar to that used in Example 19. Suppose that 
V'#V and that there is a spanning tree. We will prove that the algorithm does not stop 
at V’. Choose u € V —V’ and v € V’. Since G is connected, there must be a path from u 
to v. Each vertex on the path is either in V’ or not. Since u ¢ V’ and v € V’, there must 
be an edge f on the path with one end in V’ and one end not in V’. But then f € F and 
so the algorithm does not stop at V’. 


We now prove that, if G has a spanning tree, then (V, E’) is a minimum weight spanning 
tree. One way to do this is by induction: We will prove that at each step there is a minimum 
weight spanning tree of G that contains E’. 


The starting case for the induction is the first step in the algorithm; i.e., E’ = 0. Since 
G has a spanning tree, it must have a minimum weight spanning tree. The edges of this 
tree obviously contain the empty set, which is what E’ equals at the start. 


We now carry out the inductive step of the proof. Let V’ and E’ be the values going 
into Step 3 and let f = {x,y} be the edge chosen there. By the induction hypothesis, there 
is a minimum weight spanning tree T of G that contains the edges E’. If it also contains 
the edge f, we are done. Suppose it does not contain f. We will prove that we can replace 
an edge in the minimum weight tree with f and still achieve minimum weight. 


Since T contains all the vertices of G, it contains x and y and, also, some path P from 
x to y. Suppose x € V’ and y ¢ V’, this path must contain an edge e = {u,v} with u € V’ 
and v ¢ V’. We now prove that removing e from TJ and then adding f to T’ will still give 
a minimum spanning tree. 


By the definition of F in Step 2, e € F and so, by the definition of f, A(e) > A(f). 
Thus the weight of the tree does not increase. If we show that the result is still a tree, this 
will complete the proof. 


The path P together with the edge f forms a cycle in G. Removing e from P and 
adding f still allows us to reach every vertex in P and so the altered tree is still connected. 
It is also still a tree because it contains no cycles — adding f created only one cycle and 
removing e destroyed it. This completes the proof that the algorithm is correct. J 


The algorithm for finding a minimum weight spanning tree that we have just proved 
is sometimes referred to as Prim’s Algorithm. A variation on this algorithm, proved in 
a similar manner, is called Kruskal’s algorithm. In Kruskal’s algorithm, step 2 of Prim’s 
algorithm is changed to 


2'. Possible Edges: Let F C E be those edges f = {x,y} where x and y do not 
belong to the same component of (V, E’). If F = 0, stop. 


Intuitively, f ¢ F if f forms a cycle with any collection of edges from E’. Otherwise, f € F’. 
This extra freedom is sometimes convenient. Our next example gives much less freedom in 
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choosing new edges to add to the spanning tree, but produces a type of spanning tree that 
is useful in many algorithms applicable to computer science. 


Example 21 (Algorithm for lineal or depth-first spanning trees) We start with 
a rooted simple graph G = (V, F) with vp as root. The algorithmic process constructs a 
spanning tree rooted at ug. It follows the same general form as Theorem 5. The weights, 
if there, are ignored. 


1. Start: Let E’ = @ and let V’ = {vo} where v9 is the root of G. Let T’ = (V’, E’) 
be the starting subtree, rooted at vo. 


2. Possible New Edge: Let v be the last vertex added to V’ where 7” = (V’, E’) 
is the subtree thus far constructed, with root vp. Let x be the first vertex on the 
unique path from v to vp for which there is an edge f = {x,y} with x € V’ and 
y €V’. If there is no such 2, stop. 


3. Add Edge: Replace V’ with V’U{y} and E’ with E’U{f} to obtain T’ = (V’, E’) 
as the new subtree thus far constructed, with root vp. (Note: y is now the last 
vertex added to V’.) Go to Step 2. 


Here is an example. We are going to find a lineal spanning tree for the graph below, 
root a. The result is shown on the right where the original vertices have been replaced by 
the order in which they have been added to the “tree thus far constructed” in the algorithm. 


When there is a choice, we choose the left or upward vertex. For example, at the start, 
when 8, d, e and g are all allowed, we choose b. When vertex 2 was added, the path to the 
root was (2,1,0). We went along this path towards the root and found that at 1, a new 
edge to 3 could be added. Now the path to the root became (3, 1,0) and we had to go all 
of the way to 0 to add a new edge (the edge to 4). You should go through the rest of the 
algorithm. Although there are some choices, the basic rule of step 2 of the algorithm must 
always be followed. 


There are two extremely important properties that this algorithm has 


1. When the rooted spanning tree T for G has been constructed, there may be edges 
of G not in the spanning tree. In the above picture, there are three such edges, 
indicated by dashed lines. If {x,y} is such an edge, then either x lies on the path 
from y to the root or the other way around. For example, the edge {4,7} in the 
example has 4 on the path from 7 to the root 0. This is the “lineal” property from 
which the spanning trees of this class get their name. 
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2. If, when the rooted spanning tree T has been constructed, the vertices of T’ are 
labeled in the order added by the algorithm AND the children of each vertex of 
T are ordered by the same numbering, then an RP-tree is the result. For this RP 
tree, the numbers on the vertices correspond to preorder, PREV(T), of vertices 
on this tree (starting with the root having value 0). Check this out for the above 
example. 


We will not prove that the algorithm we have presented has properties 1 and 2. We 
leave it to you to study the example, construct other examples, and come to an intuitive 
understanding of these properties. [J 


Property 1 in the preceding example is the basis for the formal definition of a lineal 
spanning tree: 


Definition 20 (Lineal or depth-first spanning tree) Let x and y be two vertices in a 
rooted tree with root r. If x is on the path connecting r to y, we say that y is a descendant 
of x. (In particular, all vertices are descendants of r.) If one of u and v is a descendant of 
the other, we say that {u,v} is a lineal pair. A lineal spanning tree or depth-first spanning 
tree of a connected graph G = (V, E) is a rooted spanning tree of G such that each edge 
{u,v} of G is a lineal pair. 


In our example, vertices {6,7} are not a lineal pair relative to the rooted tree con- 
structed. But {4,7}, which is an edge of G, is a lineal pair. Trivially, the vertices of any 
edge of the tree T form a lineal pair. 


We close this section by proving a theorem using lineal spanning trees. We don’t 
“overexplain” this theorem to encourage you to think about the properties of lineal spanning 
trees that make the proof much simpler than what we might have come up with without 
lineal spanning trees. Recall that a graph G = (V,£) is called bipartite if V can be 
partitioned into two sets C and S' such that each edge has one vertex in C' and one vertex 
in S (Exercises for Section 2). 


Theorem 6 (Bipartite and cycle lengths) Let G = (V,E) bea simple graph. G is 
bipartite if and only if every cycle has even length. 


Proof: If G has a cycle of odd length, label each vertex with the block of some proposed 
bipartite partition {C,S}. For example, if , (a1,72,x73) are the vertices, in some order, 
of a cycle of length three, then the block labels (start with C) would be (C,S,C) This 
would mean that the edge {x,,23} would have both vertices in block C. This violates the 
definition of a bipartite graph. Since this problem happens for any cycle of odd length, a 
bipartite graph can never contain a cycle of odd length. 


To prove the converse, we must show that if every cycle of G has even length, then G 
is bipartite. Suppose every cycle of G has even length. Choose a vertex vo as root of G 
and construct a lineal spanning tree T’ for G with root vp. Label the root v9 of T with C, 
all vertices of T’ of distance 1 from vo with S, all of distance 2 from vg with C, etc. Put 
vertices labeled C' into block C of a partition {C,S} of V, put all other vertices into block 
S. If f = {x,y} is an edge of T then x and y are in different blocks of the partition {C, S} 
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by construction. If f = {x,y} is an edge of G not in T then the two facts (1) T is lineal and 
(2) every cycle has even length, imply that x and y are in different blocks of the partition 
{C,S}. This completes the proof. 0 


Exercises for Section 3 


3.1. 


3.2. 


3.3. 


3.4. 


In this exercise, we study how counting edges and vertices in a graph can establish 
that cycles exist. For parts (a) and (b), let G = (V,E,¢) be a graph with loops 
allowed. 


(a) Using induction on n, prove: 
If n > 0, G is connected and G has v vertices and v + n edges, then G has at 
least n + 1 cycles. 


(b) Prove that, if G has v vertices, e edges and c components, then G has at least 
c+e-—v cycles. 
Hint: Use (a) for each component. 


(c) Show that (a) is best possible, even for simple graphs. In other words, for each 
n construct a simple graph that has n more edges than vertices but has only 
n+ 1 cycles. 


Let T = (V, E) be a tree and let d(v) be the degree of a vertex 
(a) Prove that )7,<¢y(2—d(v)) = 2. 


(b) Prove that, if T has a vertex of degree m > 2, then it has at least m vertices 
of degree 1. 


(c) Give an example for all m > 2 of a tree with a vertex of degree m and only m 
leaves. 


Give an example of a graph that satisfies the specified condition or show that no 
such graph exists. 


(a) A tree with six vertices and six edges 


(b) A tree with three or more vertices, two vertices of degree one and all the other 
vertices with degree three or more. 


c) A disconnected graph with 10 vertices and 8 edges. 


( 
(d 


) 

) A disconnected graph with 12 vertices and 11 edges and no cycle. 
e) A tree with 6 vertices and the sum of the degrees of all vertices 12. 
) 
) 


( 
(f) A connected graph with 6 edges, 4 vertices, and exactly 2 cycles. 


(g) A graph with 6 vertices, 6 edges and no cycles. 


The height of a rooted tree is the maximum height of any leaf. The length of the 
unique path from a leaf of the tree to the root is, by definition, the height of that 


GT-35 


Basic Concepts in Graph Theory 


leaf. A rooted tree in which each non-leaf vertex has at most two children is called 
a binary tree. If each non-leaf vertex has exactly two children, the tree is called a 
full binary tree. 


(a) If a binary tree has / leaves and height h prove that | < 2”. (Taking logarithms 
gives log, (1) < h.) 


(b) A binary tree has / leaves. What can you say about the maximum value of h? 
( 
(d) Given a full binary tree with / leaves, what is the minimum height h? 


) 
c) Given a full binary tree with | leaves, what is the maximum height h? 
) 
(e) Given a binary tree of | leaves, what is the minimum height h? 


3.5. In each of the following cases, state whether or not such a tree is possible. 
(a) A binary tree with 35 leaves and height 100. 
(b) A full binary tree with 21 leaves and height 21. 
(c) A binary tree with 33 leaves and height 5. 
) 


(d) A rooted tree of height 5 where every internal vertex has 3 children and there 
are 365 vertices. 


3.6. For each of the following graphs: 


A B A B A B 


(1) (2) (3) 


a) Find all spanning trees. 


(a) 

(b) Find all spanning trees up to isomorphism. 

(c) Find all depth-first spanning trees rooted at A. 
) 


(d) Find all depth-first spanning trees rooted at B. 


3.7. For each of the following graphs: 


A —" aAg—2_9B Aq 9B 

, “4 
2 

D c D 6 D c 
(1) (2) (3) 


(a) Find all minimum spanning trees. 


(b) Find all minimum spanning trees up to isomorphism. 
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(c) Among all depth-first spanning trees rooted at A, find those of minimum 
weight. 


(d) Among all depth-first spanning trees rooted at B, find those of minimum 
weight. 


3.8. In the following graph, the edges are weighted either 1, 2, 3, or 4. 


Referring to Theorem 5 and the discussion following of Kruskal’s algorithm: 
(a) Find a minimum spanning tree using Prim’s algorithm 
(b) Find a minimum spanning tree using Kruskal’s algorithm. 


(c) Find a depth-first spanning tree rooted at K. 


Section 4: Rates of Growth and Analysis of Algorithms 


Suppose we have an algorithm and someone asks us “How good is it?” To answer that 
question, we need to know what they mean. They might mean “Is it correct?” or “Is it 
understandable?” or “Is it easy to program?” We won’t deal with any of these. 


They also might mean “How fast is it?” or “How much space does it need?” These 
two questions can be studied by similar methods, so we'll just focus on speed. Even 
now, the question is not precise enough. Does the person mean “How fast is it on this 
particular problem and this particular machine using this particular code and this particular 
compiler?” We could answer this simply by running the program! Unfortunately, that 
doesn’t tell us what would happen with other machines or with other problems that the 
algorithm is designed to handle. 


We would like to answer a question such as “How fast is Algorithm 1 for finding a 
spanning tree?” in such a way that we can compare that answer to “How fast is Algorithm 
2 for finding a spanning tree?” and obtain something that is not machine or problem 
dependent. At first, this may sound like an impossible goal. To some extent it is; however, 
quite a bit can be said. 


How do we achieve machine independence? We think in terms of simple machine 
operations such as multiplication, fetching from memory and so on. If one algorithm 
uses fewer of these than another, it should be faster. Those of you familiar with computer 
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instruction timing will object that different basic machine operations take different amounts 
of time. That’s true, but the times are not wildly different. Thus, if one algorithm uses a 
lot fewer operations than another, it should be faster. It should be clear from this that we 
can be a bit sloppy about what we call an operation; for example, we might call something 
like x = a+ 5} one operation. On the other hand, we can’t be so sloppy that we call 
x= a,+--:+a,y one operation if n is something that can be arbitrarily large. 


Example 22 (Finding the maximum) Let’s look at how long it takes to find the 
maximum of a list of n integers where we know nothing about the order they are in or how 
big the integers are. Let a,,...,@n be the list of integers. Here’s our algorithm for finding 


the maximum. 
max = ay, 


For 1 = 2,...,n 
If a; > maz, then maz = a;. 
End for 

Return max 


Being sloppy, we could say that the entire comparison and replacement in the “If” takes 
an operation and so does the stepping of the index 7. Since this is done n — 1 times, we 
get 2n — 2 operations. There are some setup and return operations, say s, giving a total of 
2n —2+-s operations. Since all this is rather sloppy all we can really say is that for large 
nm and actual code on an actual machine, the procedure will take about Cn “ticks” of the 
machine’s clock. Since we can’t determine C' by our methods, it will be helpful to have 
a notation that ignores it. We use O(f(n)) to designate any function that behaves like a 
constant times f(n) for arbitrarily large n. Thus we would say that the “If’ takes time 
O(n) and the setup and return takes time 0(1). Thus the total time is O(n) + O(1). Since 
n is much bigger than 1 for large n, the total time is O(n). OF 


We need to define © more precisely and list its most important properties. We will 
also find it useful to define O, read “big oh.” 


Definition 21 (Notation for 0 and O) Let f, g and h be functions from the positive 
integers to the nonnegative real numbers. We say that g(n) is O(f(n)) if there exist positive 
constants A and B such that Af(n) < g(n) < Bf(n) for all sufficiently large n. In this case 
we say that f and g grow at the same rate. We say that h(n) is O(f(n)) if there exists a 
positive constant B such that h(n) < Bf(n) for all sufficiently large n. In this case we say 
that h grows no faster than f or, equivalently, that f grows at least as fast as h. 


The phrase “S(n) is true for all sufficiently large n” means that there is some integer N 
such that S(n) is true whenever n > N. Saying that something is O(f(n)) gives an idea 
of how big it is for large values of n. Saying that something is O(f(n)) gives an idea of an 
upper bound on how big it is for all large values of n. (We said “idea of” because we don’t 
know what the constants A and B are.) 


Theorem 7 (Some properties of 0 and O) We have 
(a) If g(n) is O(f(m)), then g(n) is O(f(n)). 
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(b) f(n) is O(f(n)) and f(n) is O(f(n)) 
(c) If g(n) is O(f(n)) and C and D are positive constants, then Cg(n) is O(Df(n)). 
If g(n) is O(f(n)) and C and D are positive constants, then Cg(n) is O(Df(n)). 
(d) If g(n) is O(f(n)), then f(n) is O(g(n)). 
(e) If g(n) is O(f(n)) and f(m) is O(h(n)), then g(n) is O(h(n)). 
If g(n) is O(f(m)) and f(n) is O(h(n)), then g(n) is O(h(n)). 
(f) Ifgi(n) is O(f ) is O(f2(n)), then gi (n)+ga(n 


n)), go(n 
) 


( (fi( is O(max(fi(m), fo(n))). 
If gi(n) is O(fi(n (fi n 


ji 
), go(n) is O(fo(n)), then gi(n) + go(m) is O(max(fi(n), fo(n))). 


Note that as a consequence of properties (b), (d) and (e) above, the statement “g(n) is 
O(f(n))” defines an equivalence relation on the set of functions from the positive integers 
to the nonnegative reals. As with any equivalence relation, we can think of it globally as 
partition into equivalence classes or locally as a relation between pairs of elements in the set 
on which the equivalence relation is defined. In the former sense “g(n) is O(f(n))” means 
that “g(n) belongs to the equivalence class O( f(n)) associated with f.” In the latter sense, 
“g(n) is O(f(n))” means g ~e f where ~@ is an equivalence relation called “is ©.” 


Proof: Most of the proofs are left as an exercise. We’ll do (e) for 0. We are given that 
there are constants A; and B; such that 


Aif(n) < g(n) < Bif(n) 


and 
gh(n) < f(n) < Boh(n) 
for all sufficiently large n. It follows that 


Aj Agh(n) < Aif(n) < g(n) < Bif(n) < Bi Boh(n) 


for all sufficiently large n. With A = A,Ag and B = B,Bz, it follows that g(n) is 
O(h(n)). O 


Example 23 (Additional observations on 0 and O) In this example, we have 
collected some additional information about our notation. 


Functions which are not always positive. Our definitions of 6 and O are only for 
functions whose values are nonnegative. The definitions can be extended to arbitrary 
functions by using absolute values; e.g., A] f(m)| < |g(n)| < Bl] f(m)| means g(n) = O(f(n)). 
All the results in the theorem still hold except (f) for ©. This observation is most often 
applied to the case where the function f is “eventually” nonnegative (4M such that Vn > 
M,f(n) > 0). This is the case, for example with any polynomial in n with positive 
coefficient for the highest power of n. 


Taking limits. When comparing two well-behaved functions f(n) and g(n), limits can 
be helpful: 
lim g(r) = 


C>0O implies g(n) is O(f(n)) 
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and 

lim g(r) =C>0 implies g(n) is O(f(n)). 

noo f(n) 
We assume here that the function f is never zero past some integer N so that the ratio is 
defined. The constants A and B of the definition can, in the first case, be taken to be C —«€ 
and C + ¢, where € is any positive number (¢€ = 1 is a simple choice). In the second case, 
take B to be C+. If, in the first case, C = 1, then f and g are said to be asymptotic or 
asymptotically equal. This is written f ~ g. If, in the second case, C' = 0, then g is said 
to be little oh of f (written g = o(f)). We will not use the “asymptotic” and “little oh” 
concepts. 


Polynomials. In particular, you can take any polynomial as f(n), say f(n) = apn® + 
--» + ag, and any other polynomial as g(n), say g(n) = byn® +---+ bo. For f and g to 
be eventually positive we must have both a, and by positive. If that is so, then g(n) is 
O(f(n)). Note in particular that we must have g(n) is O(n*). 


Logarithms. Two questions that arise concerning logarithms are (a) “What base should 
I use?” and (b) “How fast do they grow?” 

The base does not matter because log, x = (log, b)(log, x) and constant factors like 
log, b are ignored in O( ) and O( ). 

It is known from calculus that logn — oo as n — co and that lim,_...(logn)/n* = 0 
for every € > 0. Thus logarithms grow, but they grow slower than powers of n. For example, 
nlogn is O(n3/2) but n3/? is not O(nlogn). 


A proof. How do we prove 
lim o =C>0 implies g(n) is O(f(n))? 


By definition, the limit statement means that for any € > 0 there exists N such that for all 
n> N, |2%-C| <e. Ife >C, replace it with a smaller «. From cai —C| <e, for all 


o-6< 9 cote or (C—e)f(n) < =~ < (C+ e)f(n). 


Take A = (C —e) and B = (C' +.¢) in the definition of 0. O 


Example 24 (Using ©) To illustrate these ideas, we’ll consider three algorithms for 
evaluating a polynomial p(x) of degree n at some point r; i.e., computing po +pir+::-+pyr”. 
We are interested in how fast they are when n is large. Here are the procedures. You should 
convince yourself that they work. 


Polyl(n, p,r) 


S = po 
For i= 1,...,n S = S + p; * Pow(r,i). 
Return S$ 

End 
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Pow/(r, 7) 
P=1 
For 7 =1,...,n P=Prxr. 
Return P 
End 
Poly2(n, p,r) 
S = po 
P= 
For i =1,...,n 
P=Pkrr. 
S=S+p,*P 
End for 
Return S$ 
End 
Poly3(n, p,r) 
S = pn 
Fori=n,...,2,.1 S=S*r+p 4-1 
Return S$ 
End 


Let T,,(Name) be the time required for the procedure Name. Let’s analyze Polyl. The 
“For” loop in Pow is executed 7 times and so takes Ci operations for some constant C’. 
The setup and return in Pow takes some constant number of operations D. Thus T;,(Pow) = 
Ci+D operations. As a result, the ith iteration of the “For” loop in Polyl takes Ci + E 
operations for some constants C and £ > D. Adding this over i = 1,2,...,n, we see that 
the total time spent in the “For” loop is O(n”) since )7)_, i = n(n + 1)/2. (This requires 
using some of the properties of 8. You should write out the details.) Since the rest of 
Poly1 takes O(1) time, T;,(Poly1) is O(n?). 


The amount of time spent in the “For” loop of Poly2 is constant and the loop is 
executed n times. It follows that T,,(Poly2) is O(n). The same analysis applies to Poly3. 


What can we conclude from this about the comparative speed of the algorithms? 
By the definition of ©, there are positive reals A and B so that An? < T,,(Polyl) and 
T,(Poly2) < Bn for sufficiently large n. Thus T;,(Poly2)/T,,(Poly1) < B/An. As n gets 
larger, Poly2 looks better and better compared to Polyl. 


Unfortunately, the crudeness of © does not allow us to make any distinction between 
Poly2 and Poly3. What we can say is that T;,(Poly2) is O(T;,(Poly3)); i.e., T,(Poly2) and 
T,(Poly3) grow at the same rate. A more refined estimate can be obtained by counting the 
actual number of operations involved. J 


So far we have talked about how long an algorithm takes to run as if this were a simple, 
clear concept. In the next example we'll see that there’s an important point that we’ve 
ignored. 


*Example 25 (What is average running time?) Let’s consider the problem 
of (a) deciding whether or not a simple graph can be properly colored with four colors 
and, (b) if a proper coloring exists, producing one. A proper coloring of a simple graph 
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G = (V,£) is a function \: V — C, the set of “colors,” such that, if {u,v} is an edge, then 
A(u) # A(v). We may as well assume that V = n and that the colors are c;, C2, cz and cq. 


Here’s a simple algorithm to determine a \ by using backtracking to go lexicographi- 
cally through possible colorings \(1), A(2),..., A(m). 
1. Initialize: Set v = 1 and A(1) = cy. 
2. Advance in decision tree: If v = n, stop with A determined; otherwise, set 
v=v+land XA(v) =c1. 
3. Test: If A(i) # A(v) for all i < v for which {i,v} € E, go to Step 2. 


4. Select next decision: Let j be such that \(v) = c;. If j < 4, set A(v) = cj41 
and go to Step 3. 


5. Backtrack: If v = 1, stop with coloring impossible; otherwise, set v = v — 1 and 
go to Step 4. 


How fast is this algorithm? Obviously it will depend on the graph. Here are two 
extreme cases: 


e Suppose the subgraph induced by the first five vertices is the complete graph Ks (i-e., 
all of the ten possible edges are present). The algorithm stops after trying to color the 
first five vertices and discovering that there is no proper coloring. Thus the running 
time does not depend on n and so is in @(1). 


Suppose that the first n —5 vertices have no edges and that the last five vertices induce 
Ks. The algorithm tries all possible assignments of colors to the first n — 5 vertices 
and, for each of them, discovers that it cannot properly color the last five because they 
form Ks. Thus the algorithm makes between 4"~° and 4” assignments of colors and 
so its running time is O(4") — a much faster growing time than O(1). 


What should we do about studying the running time of such an algorithm? It’s reason- 
able to talk about the average time the algorithm takes if we expect to give it lots of graphs 
to look at. Most n vertex graphs will have many sets of five vertices that induce Ks. (We 
won’t prove this.) As a result, the algorithm has running time in O(1) for most graphs. In 
fact, it can be proved that the average number of assignments of the form A(v) = cz that 
are made is O(1) and so the average running time is O(1). This means that the average 
running time of the algorithm is bounded for all n, which is quite good! 


Now suppose you give this algorithm to a friend, telling her that the average running 
time is bounded. She thanks you profusely for such a wonderful algorithm and puts it to 
work coloring randomly generated “planar” graphs. These are a special class of graphs 
whose pictures can be drawn in the plane without edges crossing each other. (All trees 
are planar, but Ks is not planar.) By a famous theorem called the Four Color Theorem, 
every planar graph can be properly colored with four colors, so the algorithm will find the 
coloring. To do so it must make assignments of the form A(v) = cz for each vertex v. Thus 
it must make at least n assignments. (Actually it will almost surely make many, many 
more.) Your friend soon comes back to you complaining that your algorithm takes a long 
time to run. What went wrong? 


You were averaging over all simple graphs with n vertices. Your friend was averaging 
over all simple planar graphs with n vertices. The average running times are very different! 
There is a lesson here: 


You must be VERY clear what you are averaging over. 


GT-42 


Section 4: Rates of Growth and Analysis of Algorithms 


Because situations like this do occur in real life, computer scientists are careful to specify 
what kind of running time they are talking about; either the average of the running time 
over some reasonable, clearly specified set of problems or the worst (longest) running time 
over all possibilities. J 


You should be able to see that saying something is O( ) leaves a lot out because we 
have no idea of the constants that are omitted. How can we compare two algorithms? Here 
are two rules of thumb. 


e If one algorithm is O(f(n)) and the other is O(g(n)), the algorithm with the slower 
growing function (f or g) is probably the better choice. 


e If both algorithms are O(f(n), the algorithm with the simpler data structures is 
probably better. 


These rules are far from foolproof, but they provide some guidance. 


*Polynomial Time Algorithms 


Computer scientists talk about “polynomial time algorithms.” What does this mean? 
Suppose that the algorithm can handle arbitrarily large problems and that it takes O(n) 
seconds on a problem of “size” n. Then we call it a linear time algorithm. More generally, 
if there is a (possibly quite large) integer k such that the worst case running time on a 
problem of “size” n is O(n*), then we say the algorithm is polynomial time. 


You may have noticed the quotes around size and wondered why. It is necessary to 
specify what we mean by the size of a problem. Size is often interpreted as the number of 
bits required to specify the problem in binary form. You may object that this is imprecise 
since a problem can be specified in many ways. This is true; however, the number of bits in 
one “reasonable” representation doesn’t differ too much from the number of bits in another. 
We won’t pursue this further. 


If the worst case time for an algorithm is polynomial, theoretical computer scientists 
think of this as a good algorithm. (This is because polynomials grow relatively slowly; for 
example, exponential functions grow much faster.) The problem that the algorithm solves 
is called tractable. 


Do there exist intractable problems; i.e., problems for which no polynomial time algo- 
rithm can ever be found? Yes, but we won’t study them here. More interesting is the fact 
that there are a large number of practical problems for which 


e no polynomial time algorithm is known and 
e no one has been able prove that the problems are intractable. 
We'll discuss this a bit. 
Consider the following problems. 


e Coloring problem: For any c > 2, devise an algorithm whose input can be any 
simple graph and whose output answers the question “Can the graph be properly 
colored in ¢ colors?” 


GT-43 


Basic Concepts in Graph Theory 


e Traveling salesman problem: For any B, devise an algorithm whose input 
can be any n > 0 and any real valued edge labeling, A: P2(n) — R, for K,, the 
complete graph on n vertices. The algorithm must answer the question “Is there 
a cycle through all n vertices with cost B or less?” (The cost of a cycle is the sum 
of A(e) over all e in the cycle.) 


e Clique problem: Given a simple graph G = (V, F) and an integer s, is there a 
subset S CV, |S| = s, whose induced subgraph is the complete graph on S' (i.e., 
a subgraph of G with vertex set S and with (5) edges)? 


No one knows if these problems are tractable, but it is known that, if one is tractable, then 
they all are. There are hundreds more problems that people are interested in which belong 
to this particular list in which all or none are tractable. These problems are called NP- 
complete problems Many people regard deciding if the NP-complete problems are tractable 
to be the foremost open problem in theoretical computer science. 


The NP-complete problems have an interesting property which we now discuss. If the 
algorithm says “yes,” then there must be a specific example that shows why this is so (an 
assignment of colors, a cycle, an automaton). There is no requirement that the algorithm 
actually produce such an example. Suppose we somehow obtain a coloring, a cycle or an 
automaton which is claimed to be such an example. Part of the definition of NP-complete 
requires that we be able to check the claim in polynomial time. Thus we can check a 
purported example quickly but, so far as is known, it may take a long time to determine if 
such an example exists. In other words, I can check your guesses quickly but I don’t know 
how to tell you quickly if any examples exist. 


There are problems like the NP-complete problems where no one knows how to do any 
checking in polynomial time. For example, modify the traveling salesman problem to ask 
for the minimum cost cycle. No one knows how to verify in polynomial time that a given 
cycle is actually the minimum cost cycle. If the modified traveling salesman problem is 
tractable, so is the one we presented above: You need only find the minimum cost cycle 
and compare its cost to B. Such problems are called NP-hard because they are at least as 
hard as NP-complete problems. A problem which is tractable if the NP-complete problems 
are tractable is called NP-easy. 


Some problems are both NP-easy and NP-hard but may not be NP-complete. Why is 
this? NP-complete problems must ask a “yes/no” type of question and it must be possible 
to check a specific example in polynomial time as noted in the previous paragraph. We 
discuss an example. 


*Example 26 (Chromatic number) The chromatic number x(G) of a graph G is the 
least number of colors needed to properly color G. The problem of deciding whether a graph 
can be properly colored with c colors is NP-complete. The problem of determining .(G) 
is NP-hard. If we know y(G), then we can determine if c colors are enough by checking if 
c>x(G). 


The problem of determining y(G) is also NP-easy. You can color G with c colors if 
and only if ¢ > x(G). We know that 0 < x(@) < n for a graph with n vertices. Ask if c 
colors suffice for c = 0,1,2,.... The least c for which the answer is “yes” is y(G). Thus the 
worst case time for finding .(G) is at most n times the worst case time for the NP-complete 
problem. Hence one time is O of a polynomial in n if and only if the other is. O 
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What can we do if we cannot find a good algorithm for a problem? There are three 
main types of partial algorithms: 


1. Almost good: It is polynomial time for all but a very small subset of possible 
problems. (If we are interested in all graphs, our coloring algorithm in Example 25 
is almost good for any fixed c.) 


2. Almost correct: It is polynomial time but in some rare cases does not find the 
correct answer. (If we are interested in all graphs and a fixed c, automatically 
reporting that a large graph can’t be colored with c colors is almost correct — 
but it is rather useless.) In some situations, a fast almost correct algorithm can 
be useful. 


3. Close: It is a polynomial time algorithm for a minimization problem and comes 
close to the true minimum. (There are useful close algorithms for approximating 
the minimum cycle in the Traveling Salesman Problem.) 


Some of the algorithms make use of random number generators in interesting ways. 
Unfortunately, further discussion of these problems is beyond the scope of this text. 


*A Theorem for Recursive Algorithms 


Some algorithms, such as merge sorting, call themselves. This is known as a recursive 
algorithm or a divide and conquer algorithm. 


When we try estimate the running time of such algorithms, we obtain a recursion. In 
Section 2 of Unit DT, we examined the problem of solving recursions. We saw that finding 
exact solutions to recursions is difficult. The recursions that we obtain for algorithms are 
not covered by the methods in that section. Furthermore, the recursions are often not 
known exactly because we may only be able to obtain an estimate of the form O( ) for 
some of the work. The next example illustrates this problem. 


*Example 27 (Sorting by recursive merging) Given a list L of n items, we wish to 
sort it. Here is the merge sorting algorithm from Section 3 of Unit DT. 


Sort(L) 
If length is 1, return L 
Else 
Split L into two lists L1 and L2 
S1 = Sort(L1) 
S2 = Sort(L2) 
S = Merge(L1, L2) 
Return $ 
End if 
End 


We need to be more specific about how the lists are split. Let m be n/2 rounded down, let 
L1 be the first m items in L and let L2 be the last n — m items in L. 
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One way to measure the running time of Sort(L) is to count the number of comparisons 
that are required. Let this number be T(n). We would like to know how fast T(n) grows 
as a function of n so we can tell how good the algorithm is. For example, is T(n) = O(n)? 
is T(n) = O(n?)? or does it behave differently? 


We now start work on this problem. Since the sorting algorithm is recursive (calls 
itself), we will end up with a recursion. This is a general principle for recursive algorithms. 
You should see why after the next two paragraphs. 


All comparisons are done in Merge(L1,L2). It can be shown that the number of 
comparisons in Merge is between m and n — 1. We take that fact as given. 


Three lines of code are important: 


S1 = Sort(L1) a recursive call, so it gives us T(m); 
S2 = Sort(L2) a recursive call, so it gives us T(n — m); 
S = Merge(L1, L2) where the comparisons are, so it gives us ay 


with m<a,<n=1. 
We obtain T(n) = T(m)+T(n—m) +a, where all we know about a, is that it is between 
m and n — 1. What can we do? 


Not only is this a type of recursion we haven’t seen before, we don’t even know the 
recursion fully since all we have is upper and lower bounds for a,. The next theorem solves 
this problem for us. 0] 


The following theorem provides an approximate solution to an important class of ap- 
proximate recursions that arise in divide and conquer algorithms. We’ll apply it to merge 
sorting. In the theorem 


e T(n) is the running time for a problem of size n. 


e If the algorithm calls itself at w places in the code, then the problem is divided into 
w smaller problems of the same kind and s1(n),...,Sw(n) are the sizes of the smaller 
problems. 


e The constant c measures how much smaller each of these problems is. 


e The time needed for the rest of the code is a,,. 


*Theorem 8 (Master Theorem for Recursions*) Suppose that there are 
(i) numbers N, b, w > 1 and0<c< 1 that do not depend on n 
(ii) a sequence a1, 42,..., 
(iii) functions $1, 82,...,Sw, and T 
such that 
(a) T(n) > 0 for alln > N anda, > 0 for alln > N; 
* This is not the most general version of the theorem; however, this version is easier to 


understand and is usually sufficient. For a more general statement and a proof, see any 
thorough text on the analysis of algorithms. 
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(b) T(n) =a, + T(si(n)) + T(so(n)) +--+ +7 (swu(n)) for alln > N; 
(c) ay is O(n) (If ap, =0 for all large n, set b = —oo.); 
(d) |s;(n) — cn| is O(1) fori = 1,2,...,w. 

Let d = — log(w)/log(c). Then 


O(n?) if b <d, 
T(n) is ¢ O(n4logn) ifb=d, 
O(n?) ifb>d. 


Note that b = 0 corresponds to a, being in (1) since n° = 1. In other words, ay is 
bounded by nonzero constants for all large n: 0 < Cy < ayn < Co. 


Let’s apply the theorem to our recursion for merge sorting: 
T(n) = an + T(si(n)) + T(s2(n)) 


where 
si(n) = |n/2|, se(n)=|n-—n/2| and si(n)<a,<n-1. 


Note that s;(n) and s9(n) differ from n/2 by at most 1/2 and that a, = O(n). Thus we 
can apply the theorem with w = 2, b=1 and c= 1/2. We have 


d = — log(2)/log(1/2) = log(2)/log(2) = 1. 
Since b = d= 1, we conclude that T(n) is O(nlogn). 


How do we use the theorem on divide and conquer algorithms? First, we must find 
a parameter n that measures the size of the problem; for example, the length of a list to 
be sorted, the degree of polynomials that we want to multiply, the number of vertices in a 
graph that we want to study. Then use the interpretation of the various parameters that 
was given just before the theorem. 


Our final example is more difficult because the algorithm that we study is more com- 
plicated. It was believed for some time that the quickest way to multiply polynomials was 
the “obvious” way that is taught when polynomials are first studied. That is not true. The 
next example contains an algorithm for faster multiplication of polynomials. There are also 
faster algorithms for multiplying matrices. 


*Example 28 (Recursive multiplication of polynomials) Suppose we want to multiply 
two polynomials of degree at most n, say 


P(x) = po tpix +--+ + pynx” and Q(x) =qturt---+qnz”. 


The natural way to do this is to use the distributive law to generate (n + 1)? products 
p0do, PoU£, Poq27,.-+, Pndnx?” and then collect the terms that have the same powers of 
x. This involves (n + 1)? multiplications of coefficients and, it can be shown, n? additions 
of coefficients. Thus, the amount of work is O(n”). Unless we expect P(a) or Q(x) to have 
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some coefficients that are zero, this seems to be best we can do. Not so! We now present 
and analyze a faster recursive algorithm. 


The algorithm depends on the following identity which you should verify by checking 
the algebra. 


Identity: If Pr(x), Px(x), Qx(x) and Qx(zx) are polynomials, then 
(Pr (2) + Palla”) (Qr(z) + Qu(x)z"™) = A(x) + (C(z) — A(x) - Bae” + B(x)x? 


where 
A(x) = Pr(x)Qi(z), B(x) = Pu(2)Qun(z), 
and 


C(x) = (Pr(x) + Pu(2)) (Qz(@) + Qu(2)) 


We can think of this identity as telling us how to multiply two polynomials P(x) and 
Q(x) by splitting them into lower degree terms (Pz (xz) and Q;(x)) and higher degree terms 
(Pa(x)x™ and Qy(x)x™): 


P(@) = D(Pr(@) + Pu(a)a™ and Q(a) = Q1(#) + Qu(a)a™ 


The identity requires three polynomial multiplications to compute A(x), B(x) and C(z). 
This leads naturally to two questions: 


e Haven’t things gotten worse — three polynomial multiplications instead of just one? 
No. The three multiplications involve polynomials of much lower degrees. We will see 
that this leads to a gain in speed. 


e How should we do these three polynomial multiplications? Apply the identity to each 
of them. In other words, design a recursive algorithm. We do that now. 


Here is the algorithm for multiplying two polynomials P(x) = pp + pix-+++pyx” and 
Q(x) =qt+uxrt+--:+qnx” of degree at most n. 


MULT(P(z), Q(x), n) 
If (n=0) Return pogo 
Else 
Let m = n/2 rounded up. 
Pps) po Fie +s Paar 


Px(%) = Dm + Dm4it +++ Pye™—™ 
Qr(#) = go +e +++: dm ee 
Qu(x) = ae ee. pe 
A(x) = MULT (Pr — 1) 
B(x) = MULT ae r - a 
C(x) = MULT ( fa nee xr), Qr(x) + Quy(z), n—m) 
D(x) = A(x) + (C(z) — A(x) — B(x a" + B(x)x?™ 
Return D(z) 
End if 


As is commonly done, we imagine a polynomial stored as a vector of coefficients. The 
amount of work required is then the number of times we have two multiply or add two 
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coefficients. For simplicity, we just count multiplications. Let that number be T(n). You 
should be able to see that T(0) = 1 and 


T(n) =T(m-1)+T(n-—m)+T(n-—m) for n>0. 
We can write this as 
T(n) =T(m—-1)+ T(n-—m)+T(n-—m)+an, a =1 anda, = 0 forn> 0. 
Note that, since both m — 1 and n — m differ from n/2 by at most 1, w = 3 and c = 1/2. 


Also 6 = —oo. 


We have d = log3/log2 > b. Thus T(n) is Q(n!°83/!982). Since log 3/log2 is about 
1.6 which is less than 2, this is less work than the straightforward method when n is large 
enough. (Recall that the work there was in @(n?).) O 


Exercises for Section 4 


4.1. We have three algorithms for solving a problem for graphs. Suppose algorithm A 
takes n? milliseconds to run on a graph with n vertices, algorithm B takes 100n 
milliseconds and algorithm C takes 100(2”/!° — 1) milliseconds. 


(a) Compute the running times for the three algorithms with n = 5, 10, 30, 100 
and 300. Which algorithm is fastest in each case? slowest? 


(b) Which algorithm is fastest for all very large values of n? Which is slowest? 


4.2. Let p(x) be a polynomial of degree & with positive leading coefficient and suppose 
that a > 1. Prove the following. 


(a) O(p(n)) is O(n"). 

(b) O(p(n)) is O(n*). 

(c) lim p(n)/a” = 0. (Also, what does this say about the speed of a polynomial 
time algorithm versus one which takes exponential time?) 


(d) Unless p(x) = p:2* + pz for some p; and po, there is no C' such that a?(”) is 
@(aCn"). 


4.3. In each case, prove that g(n) is O(f(n)) using the definition of “g is O(f)”. (See 
Definition 21.) 


(a) g(n) =n3+5n? +10, f(n) = 20n°. 
(b) g(n) =n? + 5n? +10, f(n) = 


4.4. In each case, show that the given series has the indicated property. 
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(a) 352) 77 is (3). 
(b) DL, # is O(n4), 
(c) 22 2/7? is O(n3/2). 


4.5. Show each of the following 
(a) }o"., 271 is O(log, (n)) for any base b > 1. 
(b) log,(n!) is O(n log,(n)) for any base b > 1. 
(c) n! is O((n/e)"+1/?). 


*4.6. The following algorithm multiplies two n xn matrices A and B and puts the answer 
in C. Let T(n) be the running time of the algorithm Find a simple function f(n) 
so that is O(f(n)). 


MATRIXMULT(n,A,B,C) 
For i=1,...,n 
For j=1,...,n 
C(i,j)=0 
For k=1,....n 
C(ij) = C(i,j) + AGk)*B(kj) 
End for 
End for 
End for 
End 


*4.7. The following algorithm computes x” for n a positive integer, where x is a com- 
plicated object (e.g., a large matrix). MULT(2,y) is a procedure that multiplies 
two such objects. Let T'(n) be the number of times MULT is called. Find a simple 
function f(n) so that T(n) is O(f(n)). 


POW(z, n) 
If (n=1) Return x 
Else 
Let q be n/2 rounded down and r = n — 2q. 
y = MULT(z, z) 
z= POW(y, g) 
If (r=0) Return z 
Else 
w = MULT(z, z) 
Return w 
End if 
End if 
End 
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Multiple Choice Questions for Review 


Some of the following questions assume that you have done the exercises. 


1. Indicate which, if any, of the following five graphs G = (V,E,¢), |V| = 5, is not 
isomorphic to any of the other four. 


A B C D E F 
(a) o= (ea {2,4} {1,2} {2,3} {3,5} {4,5} 
b Cc d e a 
(b) 6= ee {1,2} {2,3} {3,4} {3.4} {4.5} 


b e d Cc a 
(c) ¢= Oe (i's) (13) 23} (24) (405) 


1 2 3 4 5 6 
(d) = Ce {2,3} {2,3} {3,4} {4,5} {4,5} 


b a e d Cc 
(e) ¢= (ao (1,3) {1,3} {2,3} {2,5} a) 


2. Indicate which, if any, of the following five graphs G = (V,E,¢), |V| = 5, is not 
connected. 


(a) ¢= a (1.2 3.3) 13,4} {1,5} ae) 
(b) g= bide 3} 13} 2.3} (25) sy) 
(c) d= Ce as {1.3} 2,3} (2,4} i) 
(d) d= ee oe 1.3} 12.3} {3,4} ed 


= a b c d e 
(e) o= ee {2.3} {1.2} {1.3} (2.3) fest) 


3. Indicate which, if any, of the following five graphs G = (V,E,¢), |V| = 5, have an 
Eulerian circuit. 


(a) ¢= (x os (2.3) joa} (a8) i) 
(b) d= (25 ax {1.3} 2.3} (2,4} ia) 
(c) d= (ae net 3.9) 13,4} {4.5} oS) 
(d) g= Ga {1.3} {1.3} (2.3} 2.5} pay 


a b Cc d e 
(e) @= Gre {3.4} {1.2} {2,3} (3.5) ia) 


4. A graph with V = {1,2,3,4} is described by ¢ = ( 


How many Hamiltonian cycles does it have? 


a b c d e f 
{1,2} {1,2} {1,4} {2,3} {3,4} rae ‘7 
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5. A graph with V = {1,2,3,4} is described by ¢ = ie 
abed 
3.2 1 2 


b Cc d e f ) 
2} {1,2} {1,4} {2,3} {3,4} {3,4} ) * 
It has weights on its edges given by A = ( : : . How many minimum 


spanning trees does it have? 


6. Define an RP-tree by the parent-child adjacency lists as follows: 
(i) Root B: J, H, K; (ii) H: P,Q, R; (iii) Q:S,T; (iv) K: L, M,N. 


The postorder vertex sequence of this tree is 
(a) J, P, S, T, Q, R, H, L, M,N, kK, B. 
(b) 2 8,0 J,-Q.-R. HLM. No K. B: 
(c) P, S, T, Q, R, H, L, M, N, K, J, B. 
(d)° PS. 0; ©} Ry Je M,N, KB; 
(e) S, T, Q, J, P, R, H, L, M, N, K, B. 


7. Define an RP-tree by the parent-child adjacency lists as follows: 
(i) Root B: J, H, K; (ii) J: P,Q, R; (iii) Q: S$, T; (iv) K: L, M, N. 


The preorder vertex sequence of this tree is 
(a) B,J, HK, P,Q, Ry Ly M,N, 8, -T- 
(b) B, J, P, Q, S, T, R, H, K, L, M, N. 
(c) B, J, P, Q, S, T, R, H, L, M, N, K. 
(d) Bo, PP) 5s T, R,..H,., Ma No K. 
(e) B, J, Q, 5S, T, P, R, H, K, L, M, N. 
8. For which of the following does there exist a graph G = (V, E, ¢) satisfying the specified 


conditions? 


) A tree with 9 vertices and the sum of the degrees of all the vertices 18. 


c) A graph with 5 components 30 vertices and 24 edges. 


(a 
(b) A graph with 5 components 12 vertices and 7 edges. 
d) A graph with 9 vertices, 9 edges, and no cycles. 

e 


( 
( 
(e) A connected graph with 12 edges 5 vertices and fewer than 8 cycles. 


9. For which of the following does there exist a simple graph G = (V,£) satisfying the 
specified conditions? 


(a) It has 3 components 20 vertices and 16 edges. 


(b) It has 6 vertices, 11 edges, and more than one component. 
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16. 


Review Questions 


(c) It is connected and has 10 edges 5 vertices and fewer than 6 cycles. 

(d) It has 7 vertices, 10 edges, and more than two components. 

(e) It has 8 vertices, 8 edges, and no cycles. 

For which of the following does there exist a tree satisfying the specified constraints? 
(a) A binary tree with 65 leaves and height 6. 

(b) A binary tree with 33 leaves and height 5. 


c) A full binary tree with height 5 and 64 total vertices. 


( 
(d) A full binary tree with 23 leaves and height 23. 


) 

) 

) 

(e) A rooted tree of height 3, every vertex has at most 3 children. There are 40 total 
vertices. 

For which of the following does there exist a tree satisfying the specified constraints? 

(a) A full binary tree with 31 leaves, each leaf of height 5. 


(b) A rooted tree of height 3 where every vertex has at most 3 children and there are 
41 total vertices. 


(c) A full binary tree with 11 vertices and height 6. 

(d) A binary tree with 2 leaves and height 100. 

(e) A full binary tree with 20 vertices. 

The number of simple digraphs with |V| = 3 is 

(Ay? (b) 28 (c) 27 (d) 2° (e) 2° 

The number of simple digraphs with |V| = 3 and exactly 3 edges is 
(a) 92 (b) 88 (c) 80 (d) 84 (e) 76 

The number of oriented simple graphs with |V| = 3 is 

(a) 27 (b) 24 (c) 21 (d) 18 (e) 15 

The number of oriented simple graphs with |V| = 4 and 2 edges is 
(a) 40 (b) 50 (c) 60 (d) 70 (e) 80 


In each case the depth-first sequence of an ordered rooted spanning tree for a graph 
G is given. Also given are the non-tree edges of G. Which of these spanning trees is a 
depth-first spanning tree? 


(a) 123242151 and {3,4}, {1,4} 
(b) 123242151 and {4,5}, {1,3} 
(c) 123245421 and {2,5}, {1,4} 
(d) 123245421 and {3,4}, {1,4} 
) 


(c) 123245421 and {3,5}, {1,4} 
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7. a is 
(a) O((In(n))"/7) — (b) O(n))_— (c) O(n"”?)_— (d) O(n?/?)_—(e) O(n?) 


18. Compute the total number of bicomponents in all of the following three simple graphs, 
G = (V,E) with |V| = 5. For each graph the edge sets are as follows: 


B= 12h, 42,.3), 43,4 14 SRA sya bh 13,0F} 
BST{1, 2), 42,3), 43,41, {4,5}, 41 34} 
B= {41,2),42,3}) 44:5); 41,3} } 
(a) 4 (b) 5 (c) 6 (d) 7 (e) 8 
19. Let b> 1. Then log,((n?)!) is 
(a) O(log, (n!)) 


(b) O(log, (2 n!)) 
(c) O(nlog,(n)) 
(d) O(n? log,(n)) 


(ec) O(nlog,(n7)) 


20. What is the total number of additions and multiplications in the following code? 


s:=0 

fori:=lton 
s=st+i 
for j:= 1 toi 

s:=s+ j*i 

next j 

next i 

s :=s+10 
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